Part I


Faces: Linear Algebra Through Facial 
Recognition 


Chapter 1 


Week 1a: Facial Recognition 





Schedule 
1.1 Welcome and Orientation [10 mins]
1.2 Coding a face [40 mins]
1.3 Facial Reconstruction [20 mins]
1.4 What did we learn? [15 mins]
1.5 Course Logistics and Survey [15 mins]





1.1 Welcome and Orientation [10 mins] 


Welcome to QEA Module One! In this module, we will explore linear algebra by applying fundamental ideas 
to data in general and image recognition in particular. By the end of this module you will have learned the 
essential ingredients of linear algebra, implemented a facial recognition method, and applied the ideas more 
broadly in a short project. Let’s first imagine how a computer “sees” an image using numbers...


1.2 Coding a face [30 mins] 


First, we would like you to go to this link. Once you are there, please open the document “page1-X” that 
corresponds to your breakout room number. In that document you will find a smiley face that is unique to 
your breakout room. (If your breakout room number is greater than 8, please subtract 8 from your room 
number and use that face.) Your first activity today will be to imagine converting this face into a form that a 
computer can understand. A grid is superimposed on the face for your reference. 


Exercise 1.1 


Design a method that enables a computer to (approximately) reproduce the face from a list of numbers
and an algorithm that you define. The numbers can be grouped within the list, but your list should
contain numbers only. An example of a list containing two groups of numbers is [[0, 100], [14, 20]].
(The numbers do not need to be grouped; for example, [4, 3, 77] would work.) You will create an
algorithm (in other words, a very specific set of instructions) that tells the computer what to do with
your list of numbers. When another group applies your algorithm to the list of numbers, they should
be able to recreate the face. When you’ve defined your group’s method,


* Go to the Google Slides presentation here and find the slide for your breakout room. The slides
will look like Figure 1.2.

* Generate the list of numbers that represents your face using your method.

* Make a set of instructions (your algorithm) and write them in the text box area of your slide.





[Figure: screenshot of the “Smile Algorithms” Google Slides deck. Each breakout room has a slide where
they will write their smile reconstruction “algorithm” in the text box, as a numbered list of steps.]

Figure 1.2: Algorithms developed by each breakout group should be written on the appropriate Google Slide 
within the text box. 


1.3 Facial Reconstruction [20 mins]


Now that you have represented your face in algorithmic form, you will attempt to reconstruct a new face 
using another group’s algorithm! While you are trying to reconstruct the face, think about the approach the 
other group took, and how it differs from your own. 


Exercise 1.2 


You will try to recreate the face using the algorithm from the next (higher-numbered) breakout
room. (The highest-numbered breakout room should wrap around and reconstruct the face from room 1!)

* In your breakout room, have one person share the Google Slides deck using Zoom's "Share Screen" feature.

* Reconstruct the face collaboratively using the Zoom annotate tools, as shown in Figure 1.4.
Using the annotate tools, draw directly on the 8x8 grid next to the algorithm text.

* Record any challenges you encounter directly on their algorithm text using the Zoom annotate
tools (text box, highlight, draw, etc.).

* When you have completed reconstructing the face and marking up the other group’s algorithm
using the annotate tools, save your shared screen using the save button as shown in Figure 1.4.
(Any marks you make within the Zoom annotation will not automatically save on the Google
slide, so you have to take a screenshot to share your work.)

* Paste the picture you just saved of the reconstructed face in the next slide of the Google Slides deck,
titled “Reconstruction from Group X Algorithm.” This way you can share your work!
[Figure: screenshot of the Google Slides deck shared in Zoom. Use the Zoom annotate tools to
collaboratively follow the “algorithm” and reconstruct the smile. Save your collaborative
reconstruction, and drop the image into the Google Slides.]


Figure 1.4: Use the annotate tools in Zoom to collaboratively reconstruct a face on the 8x8 grid using the 
algorithm instructions. 


1.4 What did we learn? [15 mins] 


Exercise 1.3 


In your breakout room, please discuss the following prompts:

* Please quickly scan through the Google Slides document to see how things went in general.
Was the algorithm you crafted successful? What feedback did the other group leave for you?

* How did the other group’s algorithm differ from yours? How successful were you in
reconstructing their face?

* What components of the algorithm were essential (e.g., an origin, a cell numbering scheme,
etc.)?

* Consider a photo of a human face. In what ways might your method contain inherent limits
or biases?





1.5 Course Logistics and Survey [15 mins] 




Solution 1.1 


Room 3 Algorithm

1. Define cartesian coordinate system 0,0 - 8,8.
2. Define color (0,0,255 solid blue).
3. Origin of circle = 4,4
4. x=4, y=4, r=3.5
5. Draw circle as x^2+y^2 = r^2
6. Draw quarter circle between corners of the mouth (coordinates = ?,?), r=2
7. Draw eyes 3,5 & 5,5.
8. Good luck we didn’t have enough time to finish our pseudocode....
9. In facial recognition, you'd want to check if face recognized matches the master.
10. Use conditional logic & boolean to track point by point matching?

List of Numbers

E.g., [3, 5, 7, 9, 11, 13]


Figure 1.1: Example of an algorithm 


Solution 1.2 


Reconstruction from Room 3 Algorithm 
Challenges

* Confusion over coordinate system
* Confusion from x=4, y=4 after origin was already specified
* Confusion as to 3,5 and 5,5 being coordinates rather than the numbers “3 and a half”,
“5 and a half”, esp. given international differences in use of comma


Figure 1.3: Example of a reconstruction 


Chapter 2 


Week 1b: Facespace 





Schedule 
2.1 Pixel Arithmetic [10 mins]
2.2 A Universal Set of Building Block Images [10 mins]
2.3 A Better Set of Building Blocks? [30 mins]
2.4 Towards an Optimal Basis [30 mins]





Last class we thought about various ways to represent an image (e.g., a picture of a face). Today we’re 
going to narrow in on a particular method of representing images: as a weighted sum of a set of building 
block images. You'll be working through exercises that will show how this type of representation works and 
why it is so powerful. 


2.1 Pixel Arithmetic [10 mins] 


Adding is one of the most basic operations in mathematics. While everyone here is familiar with the concept 
of adding numbers, we can generalize this idea to add together other sorts of entities. We can even think 
about what it means to add two images together. 

As a simple example, let’s add the following two images together (we’ll explain more precisely how we
are defining addition of images once you’ve seen the result).

[Figure: two example images and the image that results from adding them.]

Conceptually, this operation might seem straightforward. Adding two images results in an image that
has a black pixel wherever either of the two images has a black pixel at the corresponding position.

More formally, we can think about black pixels as having a value of 255 and white pixels as having a
value of 0 (gray pixels have a value between these two, depending on how dark they are). A scale from
0 to 255 may seem like a weird choice, but there is a very good reason why it is the standard: digital
storage, such as on your computer, is binary (built from bits). How many integers can you represent with
an 8-bit number? To add two images together, all we do is add the corresponding elements at each point
in the grid! In this way, addition of images works much the same as addition of single numbers; the only
difference is that we perform the addition once for each position in the grid.
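The element-wise rule can also be written as a few lines of code. The sketch below uses Python with plain nested lists; the function name `add_images` and the two tiny 2 by 2 images are made up for illustration and are not the images from the figure.

```python
def add_images(a, b):
    """Add two same-sized grayscale images pixel by pixel."""
    return [[pa + pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

# 0 is white and 255 is black, following the convention in the text.
left = [[255, 0],
        [0, 0]]
right = [[0, 0],
         [0, 255]]

total = add_images(left, right)
print(total)  # [[255, 0], [0, 255]]
```

The result is black wherever either input was black, which matches the description above.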

Here is the preceding example of adding images but we’ve shown some of the intermediate steps. 
Exercise 2.1


With your group, work through the following pixel arithmetic problems on the board. 
[Figure: pixel arithmetic problems. Use the Zoom Whiteboard to fill in your answers.]





Without too much of a leap, we can also multiply images by a number by multiplying each element in 
the image by that value. We can think of this multiplication operation as “scaling” the image. 


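For example, multiplying an image by 0.5 halves every pixel value, which produces a lighter copy of the
same image.

[Figure: 0.5 × an example image and the resulting lighter image.]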
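Scaling can be sketched the same way as addition. This is a Python illustration with a made-up 2 by 2 image; the name `scale_image` is an assumption for this example.

```python
def scale_image(c, image):
    """Multiply every pixel of a grayscale image by the number c."""
    return [[c * p for p in row] for row in image]

# 0 is white and 255 is black, following the convention in the text.
image = [[255, 0],
         [128, 64]]

half = scale_image(0.5, image)
print(half)  # [[127.5, 0.0], [64.0, 32.0]]
```

Every pixel moves halfway toward white, so the scaled image looks like a fainter version of the original.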


Exercise 2.2

With your group, work through the following pixel arithmetic problems on the Zoom whiteboard.

1. [Figure: pixel arithmetic problem. Use the Zoom Whiteboard to fill in your answer.]

2. [Figure: pixel arithmetic problem. Use the Zoom Whiteboard to fill in your answer.]

3. (Don’t think about this one too hard. Just draw approximately what this would be.)

0.9999 × [image] + 0.0001 × [image] = [Use the Zoom Whiteboard to fill in your answer.]

2.2 A Universal Set of Building Block Images [10 mins]


Now that we know how to add and scale images, let’s think about how we might construct a set of building 
block images such that we can construct any image as a sum of scaled versions of these building blocks. 


Exercise 2.3 


With your group, work through the following problems. 


1. What is the range of images that could be constructed by summing over scaled versions of
the following building block images? (c is a number between 0 and 1.) Another way to think
about this is: as you sweep the value of c from 0 to 1, how does the resultant sum of the two
images change?

c × [first image] + (1 - c) × [second image]

2. What is the range of images that could be constructed by summing over scaled versions of
the following building block images? (a and b are both numbers between 0 and 1.) Instead of
having one knob to turn (as in the previous exercise), you now have two.

[Figure: building block images scaled by a and b.]

In this case we can think of the values a and b as an encoding of a particular smiley face. You will
deduce the effect that both a and b have on the specific nature of the smiley face.

3. Building on the previous example, come up with your own way of representing a simple face
like the one above as the sum of two or more scaled building block images. Be creative! It’s up
to you what sort of faces your method is capable of representing.
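If you want to experiment with "turning the knob" outside of the slides, the one-knob sum from the first problem can be sketched as follows. This is a Python illustration only: the function name `blend` and the two one-row images are placeholders, not the building blocks shown in the exercise.

```python
def blend(c, first, second):
    """Return c * first + (1 - c) * second, computed pixel by pixel."""
    return [[c * p + (1 - c) * q for p, q in zip(row_1, row_2)]
            for row_1, row_2 in zip(first, second)]

# Placeholder building blocks: one row of two pixels each (0 white, 255 black).
first = [[255, 0]]
second = [[0, 255]]

# Sweep the knob from 0 to 1 and watch the sum change.
for c in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(c, blend(c, first, second))
```

Try replacing the placeholder images with your own small grids and compare what you see with your answer to the exercise.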





You probably noticed from the previous three exercises that not all possible images can be constructed
by adding scaled versions from a small set of building block images. Suppose you wanted to be able to
represent any possible 3 pixel by 3 pixel image. While there are many possible ways to do this, for simplicity
we can define each building block image to have only a single black pixel (the rest should be white). We
could represent any 3 pixel by 3 pixel image as a sum of the following building blocks.

$c_1$ × [block 1] + $c_2$ × [block 2] + $c_3$ × [block 3]
+ $c_4$ × [block 4] + $c_5$ × [block 5] + $c_6$ × [block 6]
+ $c_7$ × [block 7] + $c_8$ × [block 8] + $c_9$ × [block 9]

(Each [block k] stands for the building block image whose single black pixel sits in a different one of
the nine positions.)



Here, $c_1, \ldots, c_9$ represent the amount of each building block image that we use.
If we assume the building blocks are fixed and known to us in advance, we can say that to encode each
image we need 9 numbers $(c_1, c_2, \ldots, c_9)$. You should also convince yourself that any 3 pixel by 3 pixel
image could be represented as a sum of scaled versions of these building block images.
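One way to convince yourself is to try it. The Python sketch below builds the nine single-pixel building blocks, encodes an arbitrary 3 by 3 image as nine numbers, and rebuilds the image from them. The function names, the row-major ordering of the blocks, and the example image are assumptions made for this illustration.

```python
SIZE = 3

def single_pixel_block(k, size=SIZE):
    """Building block k (0-based, row-major): one black pixel (255), the rest white (0)."""
    block = [[0] * size for _ in range(size)]
    block[k // size][k % size] = 255
    return block

def encode(image):
    """The nine scale factors: each pixel value as a fraction of full black."""
    return [p / 255 for row in image for p in row]

def decode(coeffs, size=SIZE):
    """Sum the scaled building blocks to rebuild the image."""
    out = [[0.0] * size for _ in range(size)]
    for k, c in enumerate(coeffs):
        block = single_pixel_block(k, size)
        for r in range(size):
            for col in range(size):
                out[r][col] += c * block[r][col]
    return out

image = [[255, 0, 255],
         [0, 51, 0],
         [102, 0, 255]]

coeffs = encode(image)
rebuilt = decode(coeffs)
print(len(coeffs))
print(rebuilt)
```

Because each block touches exactly one pixel, every scale factor controls one pixel and nothing else.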
































2.3 A Better Set of Building Blocks? [30 mins] 


At the end of the previous section we showed how one can represent any possible image as a sum of scaled 
single-pixel images. This is a very powerful idea, but we can take it even farther. Before we continue, let’s 
think about some of the ways in which this way of representing face images is not so great. 


Exercise 2.4 
Suppose you wish to represent 19 pixel by 19 pixel images of faces using the scheme you devised in
the previous set of exercises (as a sum of scaled, single-pixel images). Here is an example of what
such a face might look like.

1. If you think of the representation of each image as the scaling factor that you apply to each
of your single-pixel images, how many numbers do you need to specify this one face image?
(You answered almost this exact question in the previous part, so don’t overthink this.) As
before, when calculating how many numbers you need to specify a face image, don’t include
the building block images themselves (just the scale factors applied to each building block).

2. Does your answer to the previous part depend on the fact that you are encoding a face image?
(E.g., what if the question had been asked about an image of a flower, or a completely random
19 pixel by 19 pixel image with no special structure?)

3. Suppose someone gives you one of the numbers needed to encode a particular face. Without
looking at the face image itself, how much information (e.g., age, identity, sex, gender, etc.)
could you determine about the face just from that one number?





As you probably noticed in the previous exercise, a major drawback of the encoding we worked out
previously is that each scaling factor doesn’t tell us all that much useful information about each face (and as
a result we need a lot of these numbers to specify a particular face). It turns out that we can fix a lot of these
shortcomings by carefully choosing our set of building block images. Reframing problems by choosing a
different set of building blocks is one of the key ideas in this module.

One member of your group should go to this Google Slides presentation. Make a copy of the presentation, 
set the sharing so it is editable with a link, and copy the link into the Zoom chat window (here is a video 
walkthrough of copying the slide presentation and getting a shareable link). Now, each group member should 
go to the link. 

What you see before you is a very carefully chosen set of building block images. You should notice that 
each column represents a different building block image and each row represents a different scaled version 
of that same building block. Today, we won’t be going into detail about how we determined these particular 
building blocks. Instead, you will experiment with these building blocks in order to understand some of 
their properties. 


* You can add these scaled building block images by overlapping them on the Google Slides presentation.
Before assembling a face, duplicate the slide with all of the building blocks (an example of how to do
this for the first face in the table below is shown in this video).

* Along with these building block images, we have determined optimal encodings for a number of
different faces. Pick a few of these faces and try assembling them. Take turns so each member has a
chance to try it.

* Have one of your group members choose a face and create its encoding (don’t tell the rest of your
group which one you picked). The other members should try to guess which face it is.


Note that each column in the table corresponds to one of the building block images (a column in the Slides
presentation). Higher numbers in the table correspond to choosing the darker (more saturated) versions of
each building block image. If a 0 appears for a particular building block, don’t include that building block
at all when constructing that face.


Intensity 1 | Intensity 2 | Intensity 3 | Intensity 4 | Intensity 5 | Intensity 6 | Face image
3           | 3           | 0           | 0           | 2           | 2           | [image]
0           | 2           | 3           | 2           | 3           | 1           | [image]
3           | 0           | 1           | 4           | 1           | 1           | [image]

* How many numbers do you now need to encode a 19 pixel by 19 pixel face? (For the purposes of this
problem, don’t count the numbers needed to encode each of the building blocks. You can assume those
are already given to you.)

* Can you encode any possible face with this set of building blocks? If not, what seem to be the
limitations?

* How well does this set of building blocks work for encoding these faces? Does it seem to work equally
well across all faces? Which faces does it work well on (i.e., they can accurately be reconstructed from
the building blocks) and which faces does it work poorly on?

* Looking at the building blocks themselves, what does each building block seem to represent? In other
words, as you increase the amount of a particular building block, what features or qualities does that
impart to the resulting face? To help you think this through, below we have a grid of faces where each
row corresponds to one of the six building block images and each of the faces in the row contains a
large amount of that particular building block image in its encoding.
Each of these faces has a large amount (high scale factor) of the corresponding building block.


2.4 Towards an Optimal Basis [30 mins]


Exercise 2.5 


In this question, we want you to think about process rather than particular techniques for solving this
problem. If you have questions about what we mean by this, let us know.

Suppose someone has hired you as a consultant to create a method to encode 19 pixel by 19 pixel
images of faces (similar to the ones you just experimented with) as a sum of scaled versions of just
10 building block images.

1. What questions would you want to ask the person that hired you in order to do a good job on
this project? (I.e., what information do you need to know?)

2. What might be some qualities of a good set of building block images? (E.g., how would they
look? What sort of dimensions of variability would they have?)

3. What sort of data might you need to collect in order to inform the set of building blocks you
will ultimately deliver? (This data could be images or it could be other quantitative or qualitative
data.)

4. How might you determine whether your method is working? (These could be quantitative
measurements or qualitative observations of your system.)

5. Are there any other steps you might want to take to complete the project?

6. We will be digging into the various dimensions of the use of facial recognition technology in
society later in this module, but for now we want to get you thinking about two particular
components of that. Many face processing technologies work best on white males (e.g., check
out the Gender Shades project). One possible explanation for this phenomenon is overt bias
on the part of the creators of these technologies. Instead, for the sake of this exercise, let’s
suppose that the differences in performance are actually the result of subtle, unconscious
bias in any number of decisions that the technology creators made during the design process.
A second problem that plagues face processing algorithms is that they seem to work great
when evaluated in the settings that the technology designers had in mind when they built the
technology, but often work poorly when deployed in the real world. Looking back on the steps
you listed above, flag steps that might have the potential to introduce bias into your system
(e.g., having your system work better on one group of people than another or having it fail in
a particular use case). It’s okay if you don’t know where bias might creep in; the purpose of
this exercise is to get you asking questions rather than reaching conclusions.
Homework 1: Introduction to Matrices

Overview and Orientation


In this night assignment, we will learn some of the fundamental material about matrices and matrix
operations.


Learning Objectives


Concepts

* Define a vector, a matrix, and an array
* Describe the meaning of the dimensions of a vector, a matrix, and an array
* Give at least one interpretation of matrix-vector multiplication
* Calculate the product of a matrix-vector multiplication
* Understand dimensionality requirements for matrix-vector multiplication and predict resulting
dimensions
* Define and recognize the following special matrices: identity, diagonal, square, rectangular,
symmetric


MATLAB skills

* Determine the dimensions of a vector, matrix, or array variable
* Perform operations (addition, multiplication, transposition) on matrices
* Extract desired subarrays or matrices from arrays

Suggested Approach


* First you should quickly scan through the assignment, see what is being asked, and assess the extent
to which you already know how to do things. Spend no more than 30 minutes or so doing this.

* You should then read the assignment more closely, try out problems, and if appropriate, look at some
of the other resources that are suggested. Don’t spend more than 1 hour poking around at stuff online
unless it is really being productive: it’s easy to spend a lot of time there without accomplishing much.

* Then start doing the problems in earnest, and/or spend focused time with suggested resources.

* Once you've spent a total of 3-4 hours working on the assignment, you should check your progress.
Are you on track to finish within about 7-8 hours? Do you feel confident that you can do the stuff
that’s left? If not, this is when you should ask for help. This means talk to a colleague, or talk to a
ninja, or send an email to an instructor.

* You should turn in a PDF document with answers to all the numbered questions below. For the
MATLAB assignments, please export your work to PDF. Please carefully label the problem number in
your MATLAB script.

Resources to read and watch 


There are lots of books about Linear Algebra and lots of useful videos on the web. Here are some specific 
recommendations: 


* Introduction to Linear Algebra, by Strang
* Linear Algebra, by Lay
* Linear Algebra, by Cherney, Denton, Thomas, Waldron
* Homebrew videos
  - Matrices operating on vectors
  - Matrices operating on vectors (example)
  - Matrices operating on matrices
* Videos from others
  - Vectors, the very basics
  - 3Blue1Brown’s YouTube series on Linear Algebra


3.1 Vectors 
Consider the point p = (1, 2, -1) in 3-dimensional space. We can associate a position vector $\mathbf{v}$ with this
point, which is the vector from the origin to this point,

$$\mathbf{v} = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}.$$

Likewise, we can think of every vector as defining a point, if we assume that the vector emanates from the
origin. So, for example, the vector

$$\begin{bmatrix} 3 \\ -2 \\ 0 \\ 1 \end{bmatrix}$$

is identified with the point (3, -2, 0, 1) in 4D. Oftentimes we will mix and match these ideas and say things
like: the vector (x, y, z). What we really mean when we say this is: the point (x, y, z) can be treated as the
position vector

$$\mathbf{v} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}.$$

The vector $\mathbf{v}$, as represented above, is called a column vector. We can also have row vectors such as the
following

$$\mathbf{u} = \begin{bmatrix} p & q & r \end{bmatrix}.$$

The operation of converting a column vector to a row vector or vice versa is called taking the transpose
of the vector and is denoted with a superscript $T$. For example, the transpose of the row vector $\mathbf{u}$ from above
is

$$\mathbf{u}^T = \begin{bmatrix} p \\ q \\ r \end{bmatrix} \tag{3.1}$$

and the transpose of the vector $\mathbf{v}$ from above is

$$\mathbf{v}^T = \begin{bmatrix} x & y & z \end{bmatrix}. \tag{3.2}$$

We can take the product of a row vector with a column vector using the following formula

$$\mathbf{u}\mathbf{v} = \begin{bmatrix} p & q & r \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = px + qy + rz. \tag{3.3}$$

If we start with two column vectors

$$\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \quad \text{and} \quad \mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}$$

of length n (i.e., they are n-dimensional), then we can take the dot product

$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n.$$

In some sense, the dot product is a measure of how aligned two vectors are. Here’s the key formula:

$$\mathbf{v} \cdot \mathbf{w} = \|\mathbf{v}\| \, \|\mathbf{w}\| \cos\theta$$

where $\theta$ is the angle between $\mathbf{v}$ and $\mathbf{w}$ and

$$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

is the length of the vector $\mathbf{v}$ in n-dimensional space.
Exercise 3.1


1. Assume $\mathbf{v}$ and $\mathbf{w}$ are two vectors of unit length, i.e., $\|\mathbf{v}\| = \|\mathbf{w}\| = 1$. Using the formula above,
what angle between $\mathbf{v}$ and $\mathbf{w}$ maximizes the magnitude of the dot product? Using the formula
above, what angle between $\mathbf{v}$ and $\mathbf{w}$ minimizes the magnitude of the dot product?


2. Compute $\mathbf{v} \cdot \mathbf{w}$ where [the definitions of $\mathbf{v}$ and $\mathbf{w}$ are missing from the extracted text].



We'll learn more about the dot product as we go. For now, notice that the dot product equals the product
of the transpose of one vector with the other,

$$\mathbf{v} \cdot \mathbf{w} = \mathbf{v}^T \mathbf{w}. \tag{3.4}$$

Vectors can also be used to represent many things, such as data. Linear algebra provides a powerful set
of tools to manipulate and analyze this data.


Exercise 3.2 


For instance, you may have a three-dimensional vector $\mathbf{f}$ whose entries represent the numbers of
different fruits you have in your refrigerator. For example, the first entry could be the number of
oranges, the second the number of grapefruits, and the third the number of apples. When
organized in this manner, you can use products of row and column vectors to compute the number
of different fruits there are. For instance, suppose that

$$\mathbf{f} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix},$$

i.e., you have 1 orange, 2 grapefruits, and 3 apples in your fridge.


1. Find a row vector $\mathbf{t}$ so that the product $\mathbf{t}\mathbf{f}$ tells you the total number of fruits in your refrigerator.


2. Find a row vector $\mathbf{c}$ such that the product $\mathbf{c}\mathbf{f}$ tells you the total number of citrus fruits in your
refrigerator.


3. Suppose that in the genetically engineered future, all apples weigh 100 g, all grapefruits weigh
250 g, and all oranges weigh 120 g. Find a row vector $\mathbf{w}$ such that the product $\mathbf{w}\mathbf{f}$ tells you the
total weight of fruits in your refrigerator.





If you wanted to know the vitamin C content of the fruits in your fridge, you could formulate a similar 
vector to compute it. 

In the questions above, you took linear combinations of the entries of the vector f which gave you the 
desired quantity. Linear algebra is the study of linear combinations. 
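As an illustration of the vitamin C remark, the Python sketch below multiplies a row vector by the fruit vector f. The vitamin C amounts per fruit are invented round numbers used purely for the example, not nutritional data, and the function name is an assumption.

```python
def row_times_column(row, column):
    """Product of a row vector and a column vector: one weighted sum."""
    return sum(r * c for r, c in zip(row, column))

# Fruit counts from the exercise: oranges, grapefruits, apples.
f = [1, 2, 3]

# Made-up vitamin C content in mg per orange, grapefruit, and apple.
vitamin_c_mg = [70, 80, 8]

print(row_times_column(vitamin_c_mg, f))  # 70*1 + 80*2 + 8*3 = 254
```

Each entry of the row vector is the weight placed on the matching entry of f, which is what makes the result a linear combination.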
3.2 Matrices


Matrices are a set of numbers organized in a two-dimensional array. Matrices are a compact way to represent 
linear combinations. Matrices can also be used in a number of different ways, such as to represent data. 
When we multiply a matrix by a vector, it results in a new vector. Therefore, when we say "a matrix operates 
on a vector", we mean that the matrix multiplies the vector. Notation-wise, we use bold upper-case letters, 
e.g. A, to represent a matrix and bold lower-case letters to represent a vector, e.g. v. 

For instance, you may define a two-dimensional matrix G with two rows and three columns as follows 


G = [2 × 3 matrix; its entries are not recoverable from the extracted text] (3.6)


Matrices and vectors come in different shapes and sizes and we refer to their shape and size by the
number of rows and columns they have. A general matrix A has m rows and n columns, and we refer to
this as an m × n matrix. Vectors are then examples of matrices: row vectors have a single row, i.e., they are
1 × n matrices; and column vectors have a single column, i.e., they are m × 1 matrices.

Matrices can only multiply vectors of a certain size and produce vectors of a certain size: an m × n
matrix can only operate on a column vector of size n × 1, and will produce an output vector which is a
column vector of size m × 1. (Likewise, matrices can only multiply other matrices of a certain size: an m × n
matrix can only act on a matrix of size n × k, and will produce an output matrix of size m × k.) These basic
properties will become clearer when we look at an example.

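Consider the 3 × 2 matrix A,

$$\mathbf{A} = \begin{bmatrix} 2 & 1 \\ 3 & -1 \\ 0 & 4 \end{bmatrix},$$

and the input vector v,

$$\mathbf{v} = \begin{bmatrix} -2 \\ 1 \end{bmatrix}.$$

Their product is

$$\mathbf{A}\mathbf{v} = \begin{bmatrix} 2 & 1 \\ 3 & -1 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} -2 \\ 1 \end{bmatrix} = \begin{bmatrix} (2)(-2) + (1)(1) \\ (3)(-2) + (-1)(1) \\ (0)(-2) + (4)(1) \end{bmatrix} = \begin{bmatrix} -3 \\ -7 \\ 4 \end{bmatrix}.$$

There are two main ways to think about this multiplication. The most common view is to treat each entry of
the new vector as a dot product between a row of the matrix and the column vector. So, for example, the
first entry in the output vector is the dot product of two vectors

$$\begin{bmatrix} 2 & 1 \end{bmatrix} \begin{bmatrix} -2 \\ 1 \end{bmatrix} = (2)(-2) + (1)(1) = -3.$$

The second approach is to view the output vector as a linear combination of the columns of the matrix. The
entries in the original vector are used as multiplication weights on each column of the matrix, i.e.,

$$(-2) \begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix} + (1) \begin{bmatrix} 1 \\ -1 \\ 4 \end{bmatrix} = \begin{bmatrix} -3 \\ -7 \\ 4 \end{bmatrix}.$$

We encourage you to use both approaches when you think about multiplication.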
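The two views give the same answer, which can be confirmed with a short program. The Python sketch below uses the matrix and vector from the worked product above (the 3 × 2 matrix with rows [2, 1], [3, -1], [0, 4] and the vector [-2, 1]); the function names are assumptions for this illustration.

```python
A = [[2, 1],
     [3, -1],
     [0, 4]]
v = [-2, 1]

def matvec_rows(A, v):
    """Row view: each output entry is the dot product of one row of A with v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matvec_columns(A, v):
    """Column view: the output is a weighted sum of the columns of A."""
    result = [0] * len(A)
    for j, weight in enumerate(v):
        for i in range(len(A)):
            result[i] += weight * A[i][j]
    return result

print(matvec_rows(A, v))     # [-3, -7, 4]
print(matvec_columns(A, v))  # [-3, -7, 4]
```

Both functions perform the same multiplications and additions; only the order in which they are grouped differs.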


Exercise 3.3 


Recall the matrix G defined in equation (3.6) and the vector f defined in Exercise 3.2, which kept 
track of the number of fruit of different types. What does the vector Gf represent? 
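Exercise 3.4


If a matrix multiplies a spatial vector, the vector is transformed by the matrix, resulting in a
new vector.

* Please draw the spatial vector $\mathbf{v}$. [The definition of $\mathbf{v}$ is missing from the extracted text.]

* What happened to $\mathbf{v}$ when you multiplied by $\mathbf{A}$?

* Please draw the vector $\mathbf{u} = \mathbf{B}\mathbf{v}$, where $\mathbf{B}$ is

$$\mathbf{B} = \begin{bmatrix} \cos(30^\circ) & -\sin(30^\circ) \\ \sin(30^\circ) & \cos(30^\circ) \end{bmatrix}$$

* What happened to $\mathbf{v}$ when you multiplied by $\mathbf{B}$?

* Please draw the vector $\mathbf{t} = \mathbf{R}\mathbf{v}$, where $\mathbf{R}$ is

$$\mathbf{R} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

* What happened to $\mathbf{v}$ when you multiplied by $\mathbf{R}$?

* Please draw a new spatial vector $\mathbf{w}$. [The definition of $\mathbf{w}$ is missing from the extracted text.]

* Please draw the vector $\mathbf{s} = \mathbf{R}\mathbf{w}$.

* What does multiplying any vector by $\mathbf{R}$ do?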
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acos@ 





Y = J 
(ii) 





Figure 3.1: Rotation of vectors 


(3.12) 


You may have guessed that R defined above, rotates a vector counter-clockwise by 9. This is indeed 
rotation matrix, consider Figure 3.1 (i). Suppose that we wish to rotate the vector p counter-clockwise by 0, 


true, and R is called a rotation matrix as it transforms vectors by rotating them. To understand why R is a 


which will result in the vector q. From the figure, we see that 
_ la 
P= 43 


(3.13) 


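and p is the sum of the gray and blue vectors. If we now rotate the blue and gray vectors counter-clockwise
by $\theta$, we see that q is the sum of the rotated versions of the blue and gray vectors, as shown in Figure 3.1 (ii).

By using trigonometry, we see that the blue vector in Figure 3.1 (ii) is

$$\begin{bmatrix} a\cos\theta \\ a\sin\theta \end{bmatrix} \tag{3.14}$$

and the gray vector in Figure 3.1 (ii) is

$$\begin{bmatrix} -b\sin\theta \\ b\cos\theta \end{bmatrix} \tag{3.15}$$

Therefore, q is given by

$$q = \begin{bmatrix} a\cos\theta \\ a\sin\theta \end{bmatrix} + \begin{bmatrix} -b\sin\theta \\ b\cos\theta \end{bmatrix} = \begin{bmatrix} a\cos\theta - b\sin\theta \\ a\sin\theta + b\cos\theta \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = Rp.$$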
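The geometric argument can be checked numerically. In the sketch below the angle and the components a and b are arbitrary values chosen for illustration; the names blue and gray simply mirror the colours used in Figure 3.1.

```matlab
theta = 30;          % rotation angle in degrees (arbitrary choice)
a = 2; b = 1;        % components of p (arbitrary values)

R = [cosd(theta) -sind(theta); sind(theta) cosd(theta)];
p = [a; b];

blue = [a*cosd(theta); a*sind(theta)];    % rotated version of [a; 0]
gray = [-b*sind(theta); b*cosd(theta)];   % rotated version of [0; b]
q = blue + gray;

% The difference between q and R*p should be zero up to round-off
disp(norm(q - R*p))
```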
As we mentioned earlier, m x n matrices can multiply n x 1 vectors and produce m x 1 vectors. Consider
a generic m x n matrix A

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

where the ij-th entry of this matrix, $a_{ij}$ defined above, is the entry corresponding to the i-th row and j-th
column. You can multiply an n x 1 vector v by this matrix. Define the vector v as follows,

$$v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$$

Now define another vector w which is the product of A and v, i.e. w = Av. If we define

$$w = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{bmatrix}$$

then the i-th entry of w is given by the following sum

$$w_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n = \sum_{j=1}^{n} a_{ij}v_j$$
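As a worked instance of this sum, take the 3 x 2 matrix A = [2 1; 3 -1; 0 4] and the vector v = [-2; 1] that are entered in MATLAB in Section 3.4 (assumed here to be the same A and v used in the earlier example). With n = 2, the second entry of w = Av is:

```latex
w_2 = \sum_{j=1}^{2} a_{2j} v_j = a_{21} v_1 + a_{22} v_2 = (3)(-2) + (-1)(1) = -7
```

The same pattern with i = 1 and i = 3 gives the remaining two entries.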


Besides multiplication, a number of other operations can be done using matrices including addition, 
subtraction, inversion, transposition, etc. We will explore more of these and their associated properties now 
and later. All of these operations make matrices a very powerful tool in the study of many different systems 
which can be represented as linear transformations, or combinations. 


3.3 Addition, subtraction, multiplication, and transpose of matrices

We can add matrices of the same size, and subtract them from one another. Both operations result in matrices
of the same size and shape. The addition and subtraction operations are done element-wise. For instance, the
difference of the two matrices can be calculated as below


afd 0 

B=() : (3.17) 

aoe [83 G28 OH on 
2 


ah a ra (3.19) 


Multiplying a matrix by a scalar simply scales each entry of the matrix by the scale factor. For instance 


(3.20) 


9 12 3 
a= [5 3 | 


The transpose of a vector, denoted by the superscript T, turns a column vector into a row vector, and vice
versa. For matrices, the transpose replaces the rows with the columns (or vice-versa). For example,

$$\begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix} \tag{3.21}$$


Since the columns are replaced with the rows, the shape of the matrix changes when you transpose it. The 
following property of transposes will be useful moving forward. Consider a matrix A and a vector v. Then 


$$(Av)^T = v^T A^T \tag{3.22}$$
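A quick numerical check of this transpose property is sketched below. It assumes the A and v that are entered in Section 3.4; the variable names lhs and rhs are illustrative only. In MATLAB the apostrophe operator gives the transpose of a real matrix.

```matlab
A = [2 1; 3 -1; 0 4];   % 3 x 2
v = [-2; 1];            % 2 x 1

lhs = (A * v)';         % transpose of the 3 x 1 product, giving 1 x 3
rhs = v' * A';          % (1 x 2) times (2 x 3), giving 1 x 3

disp(lhs)
disp(rhs)
```

Note the reversed order on the right-hand side: A' * v' would not even be defined here, since A' is 2 x 3 and v' is 1 x 2.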
Exercise 3.5
Using A and B previously defined, evaluate 4A - 5B


Exercise 3.6 


If the matrix A has dimensions of 4 x 5, what are the dimensions of $A^T$?


Exercise 3.7


If the matrix A is 4 x 5 (i.e., A has dimensions 4 x 5) and the vector v is 5 x 1, what are the
dimensions of Av and $(Av)^T$?


Matrices can be multiplied together to produce other matrices. In general, when you multiply a matrix 
A with another matrix B, you need the matrix on the left side of the product to have the same number of 
columns as the number of rows in the matrix on the right side. In other words if A is m x n, and B is p x q, 
you need n = p for the product C = AB to be defined. The product results in a new matrix C which is 
m x q. The q columns of the product matrix C are precisely the q vectors that would result from multiplying
A with the vectors formed by the columns of B.

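Consider the following matrices

$$A = \begin{bmatrix} 2 & 1 \\ 3 & -1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 5 \\ -2 & 3 \end{bmatrix}$$

The product of the two, C = AB, is computed as follows

$$C = \begin{bmatrix} 2 & 1 \\ 3 & -1 \end{bmatrix} \begin{bmatrix} 1 & 5 \\ -2 & 3 \end{bmatrix} = \begin{bmatrix} (2)(1) + (1)(-2) & (2)(5) + (1)(3) \\ (3)(1) + (-1)(-2) & (3)(5) + (-1)(3) \end{bmatrix} = \begin{bmatrix} 0 & 13 \\ 5 & 12 \end{bmatrix}$$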
As a second example consider the matrices A and B defined below, and let the product C = AB.

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \\ 4 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix}$$

$$C = \begin{bmatrix} (1)(1) + (2)(2) & (1)(4) + (2)(3) \\ (3)(1) + (2)(2) & (3)(4) + (2)(3) \\ (4)(1) + (1)(2) & (4)(4) + (1)(3) \end{bmatrix} = \begin{bmatrix} 5 & 10 \\ 7 & 18 \\ 6 & 19 \end{bmatrix}$$

As mentioned above, one way of envisioning matrix multiplication is to consider the columns of the input
matrix B as a set of column vectors: we can multiply these column vectors one at a time by the matrix A,
and the resulting vectors will be the corresponding columns of the output matrix C, i.e.

$$AB = A[B_1, B_2, \ldots] = [AB_1, AB_2, \ldots]$$

where $B_1$ is the first column of matrix B, etc.
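The column-by-column view can be written as a short loop. This is a sketch for illustration using the 3 x 2 and 2 x 2 matrices of the second example; MATLAB's built-in A*B does the same job in one step.

```matlab
A = [1 2; 3 2; 4 1];    % 3 x 2
B = [1 4; 2 3];         % 2 x 2

% Build C one column at a time: column k of C is A times column k of B
C = zeros(size(A, 1), size(B, 2));
for k = 1:size(B, 2)
    C(:, k) = A * B(:, k);
end

disp(C)        % built column by column
disp(A * B)    % built-in product, for comparison
```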
Consider the following matrices:


Exercise 3.8 


Find the matrix product AB. 


Exercise 3.9 
Find the matrix product BA 




Note that these two products are NOT equal. In general, matrix multiplication, unlike scalar multiplication,
is NOT commutative. In other words, in general AB ≠ BA. However, the distributive property IS valid
for matrices: A(B + C) = AB + AC and, so long as we keep the order of the multiplication the same,
(B + C)A = BA + CA. Recall the definition of matrix addition: if two matrices are of the same size then
they can be added and each entry of the new matrix is the sum of the entries of the original matrices, e.g.


Pe Chore 
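Both claims are easy to test numerically. The sketch below uses matrices named P, Q and S to avoid confusion with the exercise matrices: P and Q are borrowed from the first multiplication example earlier in this section, and S is a made-up matrix chosen only for illustration.

```matlab
P = [2 1; 3 -1];
Q = [1 5; -2 3];
S = [1 0; 2 1];     % hypothetical third matrix for the distributive check

PQ = P * Q;         % [0 13; 5 12]
QP = Q * P;         % [17 -4; 5 -5], a different matrix

disp(isequal(PQ, QP))                     % 0: the products differ
disp(isequal(P * (Q + S), P*Q + P*S))     % 1: distribution from the left
disp(isequal((Q + S) * P, Q*P + S*P))     % 1: distribution from the right
```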


In addition to matrices A and B defined above, consider the matrix 


oie ae 





Exercise 3.10 
Calculate A(B + C). 


Exercise 3.11 


Calculate AB + AC. Is it equal to your previous answer? 




Finally, since matrix multiplication is defined, there is no reason not to multiply a matrix by itself. This 
only works if it is a square matrix. (Think about why this is true.) Using A and B from above, evaluate the 
following expressions 
Exercise 3.12


Exercise 3.13 


There are lots of matrices that are special. Use a trusted linear algebra reference to define the 
following types of matrices, and provide an example of each: 


1. Square Matrix

2. Rectangular Matrix

3. Diagonal Matrix

4. Identity Matrix

5. Symmetric Matrix





When matrices operate on (i.e., multiply) spatial position vectors, the vector which results is another
spatial position vector. The original spatial position has been 'transformed' into another position. In
particular, there are specific matrices which accomplish specific desired transformations. These are used in 
many different disciplines. 


The matrix

$$I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{3.24}$$

when multiplying the vector

$$v = \begin{bmatrix} x \\ y \\ z \end{bmatrix} \tag{3.25}$$

will reproduce the same vector, i.e. Iv = v. For this reason, the matrix I above is called an identity matrix.
Identity matrices in higher dimensions are defined the same way, i.e., a 4-dimensional identity matrix is a
4 x 4 matrix with 1s on the diagonal and zeros everywhere else, i.e.,

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{3.26}$$


Exercise 3.14 
1. Another important and simple operation is to take a vector and scale it (increase
or decrease its length) by an overall multiplicative factor while maintaining its direction.
Consider the vector

$$v = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

Thinking about how the identity matrix acts on this vector, propose a 3 x 3 matrix which
scales this vector by a factor of 3 to the vector

$$\begin{bmatrix} 3x \\ 3y \\ 3z \end{bmatrix}$$

In other words, find a 3 x 3 matrix M such that Mv = 3v for any vector v.

2. What if you want to scale the x component differently than the y component? Write down
the 3 x 3 matrix which scales the x component by 3 and the y component by 5 and leaves the
z component the same.

3. Write down the 3 x 3 matrix which scales the x component by a, the y component by b, and
the z component by c.





3.4 Matrix Operations in MATLAB 


Now you will work on examples of matrices multiplying vectors to get yourselves comfortable with matrix 
operations in MATLAB. First, let’s define the matrix A using MATLAB as follows 


>> A = [2 1; 3 -1; 0 4] 


Note that the semi-colon ends a row and begins a new row. To define the column vector v in MATLAB you 
can type the following command: 


>> v = [-2; 1]

whilst to define the row vector u in MATLAB you can type the following command

>> u = [2 -3 1]


Notice that in this case each component of the vector is separated by a space - you could also separate them 
with a comma. 
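Before multiplying, it can help to ask MATLAB for the dimensions of each variable. The sketch below uses the built-in size function; the helper can_multiply is a hypothetical name introduced only for this illustration and is not part of MATLAB.

```matlab
A = [2 1; 3 -1; 0 4];
v = [-2; 1];
u = [2 -3 1];

disp(size(A))   % 3 2  (3 rows, 2 columns)
disp(size(v))   % 2 1  (column vector)
disp(size(u))   % 1 3  (row vector)

% A product X*Y is defined only when the number of columns of X
% equals the number of rows of Y (the "inner dimensions" agree)
can_multiply = @(X, Y) size(X, 2) == size(Y, 1);

disp(can_multiply(A, v))   % 1: (3 x 2) times (2 x 1) is defined
disp(can_multiply(v, v))   % 0: (2 x 1) times (2 x 1) is not
```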


Exercise 3.15 
Using the definitions for A, v, and u from above, please solve the following using MATLAB. Do the
answers match what you expect? (Not all of these may be defined!)


1. A*v


2. u*A
3.5 Data in Matrices and Vectors


Most of the examples you saw up to now in this assignment involved vectors which represent spatial 
positions, and most of the matrices you encountered represent transformations of the spatial vectors. But, as 
you saw with the example involving fruits, vectors can also be used to store data. So can matrices. 

For instance, you may have the following matrix 


ke 35 37 al et 


49 40 48 61 


whose first row represents the forecasted high temperature in Needham for the next 4 days (as of the day 
this was written) and the second row represents the forecasted high temperatures for Washington DC. By 
representing this data in matrix form, you can do a number of operations to help extract useful information 
from the data. 


Exercise 3.16 


For this exercise, you will work with historical temperature data for the cities of Boston, Providence, 
Washington DC and New York. 


1. Download the file temps.mat from Canvas and load the data in it into MATLAB using
>> load temps.mat. You should now have access to a matrix T which contains daily average
temperatures from 1995 to 2015 for the cities of Boston, Providence, Washington DC and New
York (we are not telling you in what order yet). By using MATLAB's size function, determine
the dimensions of this matrix. Are the temperatures for each city contained in the rows or the
columns of this matrix?


2. Extract the temperatures for each city into 4 different vectors t1, t2, t3, t4, and check
that the dimensions of these vectors are as expected.


3. Find the average temperature of each city using MATLAB's mean function, and guess, based
on geography, which of the vectors corresponds to the temperature for which city.


4. What are the maximum and minimum temperatures for Boston in the 20 years for which you
have data? As you might expect there are MATLAB functions called max and min.


5. On the same axes, plot graphs for the daily temperatures for the four cities for the last year for
which you have data. Use MATLAB's legend, xlabel, and ylabel functions to label
the graphs.







6. Suppose that a genie told you that you can guess the temperature of New York, which we call
$T_n$, using the temperatures of Boston, Providence, and Washington DC, which we respectively
call $T_b$, $T_p$ and $T_d$. From the matrix T, extract a 3 x 365 matrix of daily temperatures for the
last year (for which you have data) in Boston, Providence and Washington DC.


7. The genie says that a good approximation for the temperature on a given day in New York is
given by

$$T_n \approx 0.2235\,T_b + 0.4193\,T_p + 0.3856\,T_d. \tag{3.28}$$

Formulate a matrix equation which uses the matrix from the previous part and the formula
from the genie to guess the daily temperature in New York for the last year. Apply this equation
in MATLAB.


8. On the same axes, plot your prediction for the temperature in New York from the previous
part, and the true temperature data which you extract from T. Is the prediction close?


In the course of this module, you will learn how to come up with the coefficients we provided here 
using historical data. (No, we don’t actually have a genie.) 


3.6 Conceptual Quiz





Please figure out the answer to these questions and mark your answer in Canvas. You can retake the quiz, as 
needed. 


1. A is a 3 x 4 matrix and B is a 4 x 2 matrix. What is the size of AB?


A. 2 x 3
B. 3 x 1
C. 3 x 2
D. The product is not defined.


2. What is the result of the following matrix product 


he 2s 8 4 2 
2 -2 4 —3 -3 
3-4 5 1 1 


A. 5) i 
18 14 

24 23 

B. en) 
18 14 

29 23 
C —5 18 24 
=f 1A 28 

D =e 3 
18 14 6 
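3. Match the following items (* means any number):


1. Rectangular Matrix
2. Diagonal Matrix
3. Identity Matrix
4. Symmetric Matrix

$$\text{A. } \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad \text{B. } \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix} \qquad \text{C. } \begin{bmatrix} * & 0 & 0 \\ 0 & * & 0 \\ 0 & 0 & * \end{bmatrix} \qquad \text{D. } \begin{bmatrix} * & * \\ * & * \\ * & * \end{bmatrix}$$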


4. Which of the following matrices will scale the length of any 2-D vector by 4 ? 


A. 
ta 
0 3 
B. 
1 
0 oe 
V2 
C. 
+ 0 
0 1 
D. 


| 


5. All of the following vectors are unit length. In which picture is v - w the largest? 


NIRNIR 
———— 


NIRNIFE 


A. 
y 
w 
x 
B. 
y 
w 
x 
Vv 
Solution 3.1


1. Using the formula, $v \cdot w = \cos(\theta)$. So, when $\theta = 0$ (i.e., the vectors point in the same
direction) the magnitude of the dot product is maximized and when $\theta = \pi/2$ (i.e., the vectors
are perpendicular) the magnitude of the dot product is minimized.


2. The dot product is
$v \cdot w = -2 + 0 - 4 + 18 = 12$


Solution 3.2 


1. Let t = [1 1 1]. Then tf = 1 + 2 + 3 = 6, the total number of fruits in your refrigerator.


2. Let c = [1 1 0]. Then cf = 1 + 2 + 0 = 3, the total number of citrus fruits in your
refrigerator.


3. Let w = [120 250 100]. Then wf = 120 + 500 + 300 = 920, the total weight of the fruits
in your refrigerator.


Solution 3.3 


The vector Gf is a 2 x 1 vector whose first entry represents the total number of fruits and second 
entry represents the number of citrus fruits. 


Solution 3.4 


1. The vector v is 


2. First, we compute 


which is visually represented as 


3. Multiplying v by A rotated the vector counterclockwise by 45 degrees. 


4. First we compute 


which is visually represented as 
5. Multiplying v by B rotated the vector counterclockwise by 30 degrees.


6. First we compute 


sin 0 


y 
0 
x 


7. Multiplying v by R rotated the vector counterclockwise by an angle $\theta$.


y 
Ww 
x 


cos @ — sin 4 


ee iS | 


which is visually represented as 


8. The vector w is 


9. First we compute 


sin @ + cos 6 


y 
Ww 
x 


10. Multiplying any vector by R rotates it counterclockwise by $\theta$.


s=Rw=| 


which is visually represented 


Solution 3.5 


(3.23) 


Solution 3.6 


The dimensions of $A^T$ are 5 x 4.


Solution 3.7
Av is 4 x 1 and $(Av)^T$ is 1 x 4.


Solution 3.8 


so-[~4 3] 


—3 -3 
Solution 3.9


Ba=|~") | 


2 —-7 
Solution 3.10 
-12 3 


A(B +0) = ie Al 


Solution 3.11 


It is the same answer, as expected, since you can distribute matrices. 


Solution 3.12 


5 [4> A 
walp a 


3__ [152 —72 
oS, By 8 


Solution 3.13 


1. A square matrix is one that has size n x n, e.g.,


2. A rectangular matrix is one that has size m x n where n is not equal to m, e.g., 


* ok O* 
BO 


3. A diagonal matrix is one whose only non-zero elements are on the diagonal from upper left to 
lower right, e.g., 


2 0 0 
0 3 0 
0 0 8 


4. The identity matrix is a square matrix with all zeroes except along the diagonal from the upper 
left to lower right, where the entries are all 1, e.g., 


1 0 0 
0 1 0 
00 1 

5. A matrix is symmetric if it is square and equal to its own transpose, i.e. $A = A^T$, e.g.,


1 
7 4 -5 
3 
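Solution 3.14


1. The matrix is

$$M = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{bmatrix}$$

2. The matrix is

$$M = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

3. We can generalize the result:

$$M = \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}$$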


Solution 3.15 
1. [-3; -7; 4]

2. [-5 9]


3. Does not work because the inner matrix dimensions must agree and here we have a 3 x 2
matrix multiplied by a 1 x 3 matrix.


4. Does not work because the inner matrix dimensions must agree and here we have a 2 x 1
matrix multiplied by a 3 x 2 matrix.


5. [-3; -7]

6. 9


7. Does not work because the index exceeds matrix dimensions. It is trying to access columns
2-4 of a two column matrix.


8. Does not work because the inner matrix dimensions must agree and here we have a 1 x 3
matrix multiplied by a 1 x 2 matrix.


Solution 3.16 


1. After loading the temperatures you can see that they are stored inside a matrix called T which
has 4 rows and 7670 columns, so presumably the temperature for each city is stored in a row.


2. We can extract the first temperature by typing >> t1 = T(1,:), which simply
grabs all of the elements in the first row, so that t1 should be a row vector of size 1 by 7670.
We create the other vectors in a similar way.


3. We can take the mean of the first city by typing >> mean(t1) and we get 51.7667. The other
means respectively are 51.9140, 58.4365, and 55.9451. A little bit of geography suggests that
the cities are ordered as follows: Boston, Providence, DC, New York.


4. We can compute the maximum by typing >> max(t1) and we get 90.7. The minimum is 0.7.


5. We are only supposed to grab the last year (365 days) so for Boston we would type
>> plot(t1(end-364:end)), or we could use the actual size of the vector.


6. Boston, Providence, and DC are stored in the first three rows. We'll extract their data and
store it in a new matrix S by typing >> S = T(1:3,end-364:end), which grabs the
first three rows and the last 365 entries.

7. We can define Tn by typing >> Tn = 0.2235*S(1,:) + 0.4193*S(2,:) + 0.3856*S(3,:)
since the city temperatures are stored in each of the three rows of
the matrix S.


8. Graphically they look pretty good. We can also examine the data a little more closely by 
looking at the difference between the predicted temperature and the actual temperature - it 
fluctuates with a mean of roughly 8.8115e-04, a maximum of 7.5334, and a minimum of -6.8966. 
Compared to the actual temperatures this implies that the prediction is never any worse than 
roughly 10%. 
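The weighted sum in part 7 can also be written as a single matrix product, which is one way to read the "matrix equation" the exercise asks for. The sketch below is an illustration only: because temps.mat is not available here, S is filled with made-up placeholder numbers (not real temperatures), and the names c, Tn_matrix and Tn_sum are assumptions of this example.

```matlab
% Genie's coefficients from equation (3.28), stored as a 1 x 3 row vector
c = [0.2235 0.4193 0.3856];

% Placeholder data standing in for the 3 x 365 matrix S (NOT from temps.mat):
% rows are Boston, Providence, DC; columns are days
S = [30 32 35 40;
     31 33 36 41;
     38 40 44 50];

Tn_matrix = c * S;    % (1 x 3) times (3 x 4) gives a 1 x 4 row of predictions

% The same prediction written as an explicit weighted sum of the rows
Tn_sum = c(1)*S(1,:) + c(2)*S(2,:) + c(3)*S(3,:);

disp(max(abs(Tn_matrix - Tn_sum)))   % should be zero up to round-off
```

With the real data, the same line c * S would produce a 1 x 365 vector of predictions.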
Week 2a: Matrix Transformations


Schedule
4.1 Debrief [15 minutes]
4.2 Synthesis [30 minutes]
4.3 2D Rotation Matrices [45 minutes]





4.1 Debrief [15 minutes] 


Exercise 4.1 


1. In your breakout room, identify a list of key concepts/take home messages/things you learned
in the assignment. Try to group them in categories like "Concepts", "Technical Details", "MATLAB",
etc.


2. Try to resolve your confusions with your breakout room-mates and by talking to an instructor.





4.2 Synthesis [30 minutes] 


Exercise 4.2 


These are fundamental ideas about matrices and it is important to complete these.
1. What is the difference between a scalar, a vector, a matrix, and an array?
2. What are the rules for adding matrices?
3. When can two matrices be multiplied, and what is the size of the output?
4. What is the distributive property for matrix multiplication?
5. What is the associative property for matrix multiplication?
6. What is the commutative property for matrix multiplication?


Exercise 4.3 
These are synthesis problems. It would be helpful to complete these. 


1. Use the distributive law to expand $(A+B)^2$ assuming that A and B are matrices of appropriate
size. How does this compare to the situation for real numbers?


2. Show that D = E By satisfies the matrix equation $D^2 - D - 6I = 0$.


3. Let A be a square matrix. Show that $A^2$ commutes with A.


Exercise 4.4 


These are challenge problems. Pick one of them to wrestle with. It is not important to complete 
these. 


1. The matrix exponential is defined by the power series

$$\exp A = I + A + \frac{A^2}{2!} + \cdots$$

Assume

$$A = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}.$$

Find a formula for exp A.


2. The real number 0 has just one square root: 0. Show, however, that the 2 x 2 zero matrix has
infinitely many square roots by finding all 2 x 2 matrices A such that $A^2 = 0$.


3. Use induction to prove that $A^n$ commutes with A for any square matrix A and positive integer
n.


4.3 2D Rotation Matrices [45 minutes]


We're going to think about how to use rotation matrices to rotate a geometrical object. In doing so we will
solidify fundamental concepts around matrix multiplication and start to explore the notion of "inverse". For
clarity we will first work in 2D. Recall that the rotation matrix $R(\theta)$:

$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

will rotate an object counterclockwise about the origin through an angle of $\theta$.
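In MATLAB, one convenient way to experiment with this matrix is an anonymous function that builds it for any angle. The sketch below is one possible setup rather than a prescribed one: the name R and the choice to work in degrees (via cosd and sind) are assumptions of this example.

```matlab
% Rotation matrix for an angle t given in degrees
R = @(t) [cosd(t) -sind(t); sind(t) cosd(t)];

p = [1; 0];          % a point on the positive x axis
q = R(90) * p;       % rotate it counterclockwise by 90 degrees

disp(q)              % lands on the positive y axis
disp(norm(q))        % rotation leaves the length unchanged
```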


Exercise 4.5 
This is a hands-on, conceptual problem involving the multiplication of 2D rotation matrices.


1. Place an object on your table, and imagine that the origin of an xy-coordinate system is at the
center of your object with +z pointing upwards.


2. Rotate it counterclockwise by 30 degrees. How would you undo this rotation?


3. Starting again, rotate it counterclockwise by 30 degrees, and then again by another 60 degrees.
What is its orientation now? How would you get there in one rotation instead? What does
this suggest about the multiplication of rotation matrices?


4. What happens if you first rotate it by 60 degrees, and then by 30 degrees? What does this
suggest about the commutative property of 2D rotation matrices?


Exercise 4.6 


This is an algebra problem involving the multiplication of 2D rotation matrices. 
1. Use some algebra to show that 2D rotation matrices commute, i.e. $R(\theta_1)R(\theta_2) = R(\theta_2)R(\theta_1)$.


2. Use some algebra to show that $R(\theta_1)R(\theta_2) = R(\theta_1 + \theta_2)$. You will need to look up some trig
identities.


Exercise 4.7 


Now, consider a rectangle of width 2 and height 4, centered at the origin. For clarity, this means that 
the corners of the rectangle have coordinates (1, 2), (-1, 2), (-1, -2), and (1, -2).


1. Plot these four points by hand and connect them with lines to complete the rectangle. 


2. Now, using the appropriate rotation matrix, transform each of the corner points by a rotation 
through 30 degrees counterclockwise (recall that the sin and cos of 30 degrees can be expressed 
exactly). Compute and plot the resulting points by hand and connect them with lines. Does 
the resulting figure look like you’d expect? 


Exercise 4.8 
Now, let’s do it in MATLAB. 


1. Create and plot the original 4 points: (1, 2), (-1, 2), (-1, -2), and (1, -2). Then create the
matrix that rotates them by 30 degrees counterclockwise, transform each of the four original
points using the rotation matrix, and plot the resulting points. Does this look right? Reminder:
plot(1,2,'x') puts a mark at the point (1,2). MATLAB: the functions cos and sin expect
radians, while cosd and sind expect degrees.


2. Operating on individual points with the rotation matrix is cool, but we can be much more
efficient by operating on all 4 points at the same time. Write down the matrix whose columns
represent the four corners of the rectangle. Then write down the matrix multiplication
problem we can solve to transform the rectangle from above all at once. Create these matrices
in MATLAB to perform the rotation in a single operation. Plot the resulting matrix to confirm
your transformation! Some MATLAB tips: plot(X, Y) creates a line plot of the values in
the vector Y versus those in the vector X. So if you wanted to plot a line from the origin (0,0) to
the point (1,2), you would do this: plot([0 1],[0 2]). The command axis([-xlim xlim
-ylim ylim]) sets the axes of the current plot to run from -xlim to xlim and from
-ylim to ylim.


3. What is the area of the rectangle before and after the rotation?


4. What matrix should you use to undo this rotation? Define it in MATLAB and check.


5. Show that the product of this matrix (which undoes the rotation) with the original rotation
matrix is the identity matrix. For clarity, let's give this matrix the symbol $R^{-1}$. It is the matrix
that undoes or inverts the original operation and is known as the inverse of the matrix R.
Solution 4.2


1. Scalars, vectors, and matrices are examples of arrays. A 0-dimensional array can be thought of
as a scalar. A 1-dimensional array is a vector. A 2-dimensional array is a matrix.


2. The matrices have to be the same size and addition is element-wise.


3. The matrices have to be compatible (inner dimensions agree), and the output size is dictated by the
outer dimensions, i.e. (n x m)(m x s) = (n x s).


4. Distributive property: A(B + C) = AB + AC
5. Associative property: A(BC) = (AB)C


6. Commutative property: Two matrices commute if AB = BA, but this is not always true.


Solution 4.3 


1. Using the distributive property you can see that $(A + B)^2 = (A+B)(A+B) = A^2 + AB + BA + B^2$.
In general AB ≠ BA so no further simplification is possible. Since real numbers
always commute the result is the more familiar $(x + y)^2 = x^2 + 2xy + y^2$.


2. If you plug D and $D^2$ into the equation you should find that the result is a zero matrix.


3. You need to show that $A^2 A = A A^2$ using already established properties, i.e.
$A^2 A = (AA)(A) = A(AA) = A A^2$.


Solution 4.4 


1. The matrix exponential is defined by the power series $\exp A = I + A + \frac{A^2}{2!} + \cdots$. Notice
that this A is diagonal and

$$A^2 = \begin{bmatrix} 2^2 & 0 \\ 0 & 3^2 \end{bmatrix}$$

and the exponential becomes

$$\exp A = \begin{bmatrix} 1 + 2 + 2^2/2! + \cdots & 0 \\ 0 & 1 + 3 + 3^2/2! + \cdots \end{bmatrix}.$$

If you have seen power series before then you will recognise that

$$\exp A = \begin{bmatrix} \exp 2 & 0 \\ 0 & \exp 3 \end{bmatrix}.$$


2. You can define a general two by two matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$

find $A^2$, set each of the entries equal to zero and find constraints on the entries a, b, c, d.


3. You need to show that $A^n A = A A^n$ for any square matrix A and any positive integer n by
induction. First you show it is true for n = 1 and n = 2. Then assume it is true for some
n = k, and prove that it must be true for n = k + 1. You use the fact that A commutes with
itself and the associative property, i.e. $A^2 A = (AA)A = A(AA) = A A^2$.
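The result of part 1 can be checked numerically. The sketch below compares three things for the diagonal matrix of Exercise 4.4: MATLAB's built-in matrix exponential expm, the claimed closed form, and a truncated power series. The cutoff of 30 terms and the variable names are arbitrary choices made for this illustration.

```matlab
A = [2 0; 0 3];

E_builtin = expm(A);               % built-in matrix exponential
E_diag = diag(exp([2 3]));         % claimed closed form for this diagonal A

% Partial sum of the power series I + A + A^2/2! + ... (30 terms, arbitrary)
E_series = eye(2);
term = eye(2);
for k = 1:30
    term = term * A / k;           % term now holds A^k / k!
    E_series = E_series + term;
end

disp(norm(E_builtin - E_diag))     % both differences should be tiny
disp(norm(E_series - E_diag))
```

Note that expm (matrix exponential) is different from exp, which acts entry by entry; the two only agree on the diagonal entries here because A is diagonal.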


Solution 4.5 
1. Okay, I placed my book on the table.


2. You could undo the rotation by rotating clockwise by 30 degrees. You could think about this
as a counterclockwise rotation of -30 degrees.


3. You could get there by rotating once by 90 degrees. This suggests that the product of two
rotation matrices of angles $\theta_1$ and $\theta_2$ is a rotation matrix of $\theta_1 + \theta_2$, i.e.
$R(\theta_1)R(\theta_2) = R(\theta_1 + \theta_2)$.

4. You end up in the same orientation, so the order does not matter. This suggests that the order
of multiplication doesn't matter so that two rotation matrices must commute.


Solution 4.6 


1. You could multiply out two rotation matrices with angles $\theta_1$ and $\theta_2$ in the two different orders and
you will observe that the output is the same because real numbers commute, i.e.
$\cos\theta_1 \cos\theta_2 = \cos\theta_2 \cos\theta_1$.


2. If you multiply two matrices together you will get the following expression in the first row
and first column, $\cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2$. You will find a trig identity which reduces this
to $\cos(\theta_1 + \theta_2)$. Similar reductions take place for the other elements.


Solution 4.7 


1. The rectangle is 


>< 




















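2. The rotation matrix is

$$R = \begin{bmatrix} \cos 30^\circ & -\sin 30^\circ \\ \sin 30^\circ & \cos 30^\circ \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix}$$

Applying this to each point, we get

$$R\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} \sqrt{3} - 2 \\ 1 + 2\sqrt{3} \end{bmatrix}, \qquad R\begin{bmatrix} -1 \\ 2 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} -\sqrt{3} - 2 \\ -1 + 2\sqrt{3} \end{bmatrix},$$

$$R\begin{bmatrix} -1 \\ -2 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} -\sqrt{3} + 2 \\ -1 - 2\sqrt{3} \end{bmatrix}, \qquad R\begin{bmatrix} 1 \\ -2 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} \sqrt{3} + 2 \\ 1 - 2\sqrt{3} \end{bmatrix}.$$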


And the rotated figure looks like, 


Solution 4.8 


1. There are lots of ways to do this point by point. Here is an example of how to transform the 
bottom right point: 


>> BR = [1; -2]
>> plot(BR(1,:),BR(2,:),'b*'), hold on
>> rotmatrix = [cosd(30) -sind(30); sind(30) cosd(30)]
>> nBR = rotmatrix*BR
>> plot(nBR(1,:),nBR(2,:),'r*')
2. There are lots of ways to do this. Here is an example where we include the first point twice so
that the points can easily be connected with lines:


>> pts = [1 -1 -1 1 1; 2 2 -2 -2 2]
>> npts = rotmatrix*pts
>> plot(pts(1,:),pts(2,:),'b'), hold on
>> plot(npts(1,:),npts(2,:),'r')

>> axis([-3 3 -3 3]) 

>> axis equal 


3. The area of the rectangle is the same before and after rotation: 8 square units. 


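4. To undo this rotation you could simply rotate it by 30 degrees clockwise, using the matrix

$$R^{-1} = \begin{bmatrix} \cos 30^\circ & \sin 30^\circ \\ -\sin 30^\circ & \cos 30^\circ \end{bmatrix}.$$


5. The product of $R^{-1}$ and R is

$$R^{-1}R = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta + \sin^2\theta & 0 \\ 0 & \cos^2\theta + \sin^2\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

where we have used the trig identity $\cos^2\theta + \sin^2\theta = 1$.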
Schedule
5.1 3D Rotations [45 minutes]
5.2 Reflection and Shearing [30 minutes]
5.3 Matrix Summary [15 minutes]


5.1 3D Rotations [45 minutes]


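We can extend the idea of 2D rotations to 3D rotations. The simplest approach is to think of 3D rotations as
a composition of rotations about different axes. First let's define the rotation matrices for counterclockwise
rotations of angle $\theta$ about the x, y and z axes respectively.

$$R_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix} \tag{5.1}$$

$$R_y = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix} \tag{5.2}$$

$$R_z = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{5.3}$$

For example, to first rotate a vector v counterclockwise by $\theta$ about the x axis followed by counterclockwise
by $\phi$ about the z axis, you need to do the following

$$\begin{bmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix} v \tag{5.4}$$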
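The ordering in (5.4), where the matrix applied first sits closest to v, can be sketched in MATLAB as follows. The function names Rx and Rz, the use of degrees, and the particular angles and test vector are all assumptions made for this illustration.

```matlab
% Rotation matrices about the x and z axes, for angles in degrees
Rx = @(t) [1 0 0; 0 cosd(t) -sind(t); 0 sind(t) cosd(t)];
Rz = @(t) [cosd(t) -sind(t) 0; sind(t) cosd(t) 0; 0 0 1];

v = [1; 2; 3];            % arbitrary test vector
theta = 20; phi = 50;     % arbitrary angles

% Apply the rotations one after the other: x first, then z
step1 = Rx(theta) * v;
step2 = Rz(phi) * step1;

% Equivalent single expression: the first rotation is written closest to v
combined = Rz(phi) * Rx(theta) * v;

disp(norm(step2 - combined))   % should be zero up to round-off
```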


We will next look at some sequence of physical rotations and relate them to these rotation matrices. 


Exercise 5.1 


Hold a closed book in front of you, with the top of the book towards the ceiling (+z = (0, 0, 1)
direction) and the cover of the book pointed towards you (+x = (1, 0, 0) direction), which leaves the
opening side of the book pointing towards your right (+y = (0, 1, 0)) and the spine toward the left.


1. Rotate the book by 90 degrees counter-clockwise about the x-axis, then from this position, 
rotate the book by 90 degrees counter-clockwise about the z-axis. Which direction is the cover 
of the book facing now? 


2. Return to the starting position. Now rotate the book by 90 degrees counter-clockwise about
the z axis, and then from this position, rotate the book by 90 degrees counter-clockwise about
the x axis. Which direction is the cover of the book facing now? Is it the same as in part 1?


3. An operation "commutes" if changing the order of operation doesn't change the result. Do 3D
rotations commute?


4. The cover of the book is originally pointed towards (1, 0, 0). Multiply this vector with the
appropriate sequence of rotation matrices from above to reproduce your motions from part 1.
Do you end up with the correct final cover direction?


5. Multiply the (1, 0, 0) vector with the appropriate sequence of rotation matrices to reproduce
the motions from part 2. Do you end up with the correct final cover direction?


6. Multiply the result of the previous part by the appropriate sequence of rotation matrices to
return to the original (1, 0, 0) vector.


7. From either of your answers to part 4 or part 5, try, instead of operating on the (1, 0, 0) vector
sequentially with one rotation matrix and then the other, taking the product of the two rotation
matrices first, and then multiplying (1, 0, 0) with the resultant matrix. Does this reproduce your
answer?


8. Based on your answers to the previous parts, show that $(R_z R_x)^{-1} = R_x^{-1} R_z^{-1}$. This is a
general property of matrix inverses: it works for all square, invertible matrices, not just
rotation matrices!





5.2 Reflection and Shearing [30 minutes] 


In this activity we will meet reflection and shearing matrices, which will allow us to explore transformation 
matrices in general. 


Reflection 


Exercise 5.2 


What do the following reflection matrices do? Think about it first, draw some sketches and then test 
your hypothesis in MATLAB using the rectangle with vertices (0,0), (2,0), (2,1), and (0, 1). How 
much does the area of your basic rectangle change, if at all? What is the inverse of each? 


fo 
$$\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}$$





Shearing 


Exercise 5.3 


What do the following shearing matrices do? Think about it first, draw some sketches and then test 
your hypothesis in MATLAB with the rectangle with vertices (0,0), (2,0), (2,1), and (0,1). How 
much does the area of your basic rectangle change, if at all? What is the inverse of each? 


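1. $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$

2. $\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$

3. $\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$

4. $\begin{bmatrix} 1 & 0 \\ k & 1 \end{bmatrix}$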





5.3 Matrix Summary [15 minutes]


• Matrices are holders of data, e.g. temperature data, coordinates of points, etc.

• Matrices are transformation operators, e.g. rotation, reflection, shearing.

• Matrices have algebraic properties:

  - $A + B = B + A$
  - $(AB)C = A(BC)$
  - $AB \neq BA$ in general (matrices don't always commute)

• 2D Rotation Matrix (2D rotations commute with each other):

  $R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$

• 3D Rotation Matrices (do not commute about different axes)

• Order of operations: $ABv$ implies that $B$ acts on $v$, and then $A$ acts on the result. Alternatively,
compute the product $AB$, and use it to act on $v$.

• Inverse Matrix: undoes a transformation

  - $A^{-1}A = I$
  - $(AB)^{-1} = B^{-1}A^{-1}$


Exercise 5.4 


Read through the matrix summary, and discuss concepts that you are still confused by. 
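Solution 5.1
1. The cover is now facing toward the +y axis (the positive part of the y axis).
2. The cover is now facing the +z axis. This is different from part 1.
3. Since the answers for the first two parts are different, 3D rotations do not commute.

4. Let v be the vector that represents the initial direction of the cover of the book,

$v = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$

Rotation by 90 degrees counterclockwise around the x axis is given by

$R_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}$

so that the new vector becomes

$R_x v = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$

Rotation by 90 degrees counterclockwise around the z axis is given by

$R_z = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

so that the new vector becomes

$R_z R_x v = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$

which is the correct final direction.

5. Using the matrices from above,

$R_x R_z v = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$

6. To rotate 90 degrees clockwise around the x axis we use the matrix

$R_x^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}$

and to rotate 90 degrees clockwise around the z axis we use the matrix

$R_z^{-1} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Then we can return the vector (0, 0, 1) to its original position (1, 0, 0) by

$R_z^{-1} R_x^{-1} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$

7. We can multiply the rotation matrices together and perform a single matrix multiplication.
For part 4, the relevant matrix product is

$R_z R_x = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$

and we see that

$(R_z R_x) v = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$

as expected.

8. We can see from the previous parts that

$(R_x R_z)^{-1} = R_z^{-1} R_x^{-1}$, and likewise $(R_z R_x)^{-1} = R_x^{-1} R_z^{-1}$.

In other words, when you take the inverse, the order of operations must swap!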
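The book-rotation results can be checked numerically. The following is a minimal sketch, assuming the 90 degree rotation matrices about the x and z axes written as `Rx` and `Rz` (variable names chosen here for illustration); it compares the two rotation orders and the inverse of a product.

```matlab
% Rotation by 90 degrees counterclockwise about the x and z axes
Rx = [1 0 0; 0 0 -1; 0 1 0];
Rz = [0 -1 0; 1 0 0; 0 0 1];
v  = [1; 0; 0];            % initial direction of the book cover

part1 = Rz * (Rx * v);     % rotate about x first, then about z
part2 = Rx * (Rz * v);     % rotate about z first, then about x

% Multiplying the matrices first gives the same result as applying
% the rotations one at a time
combined = (Rz * Rx) * v;

% Inverse of a product: the order of the individual inverses swaps
lhs = inv(Rx * Rz);
rhs = inv(Rz) * inv(Rx);

disp(part1');
disp(part2');
disp(norm(lhs - rhs));     % close to zero if the two sides agree
```

Because `part1` and `part2` differ, the two rotation orders are not interchangeable, which is the non-commutativity observed with the book.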


Solution 5.2 


1. This matrix reflects everything over the y-axis. In the figure below, the original blue rectangle 
becomes the orange rectangle. The area of the rectangle stays the same. 








Figure 5.1: Reflection over y-axis.


2. This matrix reflects everything over the x-axis. In the figure below, the original blue rectangle 
becomes the orange rectangle. The area of the rectangle stays the same. 
Figure 5.2: Reflection over x-axis.
3. For example, let $\theta = 30$ degrees. Then the rectangle is reflected across the line that is 30 degrees
counterclockwise from the x-axis. In the figure below, the original blue rectangle becomes the
orange rectangle.

Figure 5.3: Reflection over 30 degree line.

Notice that, if we plug in $\theta = 90$ degrees, we get the matrix from part 1, which reflects over the y-axis
(i.e., the 90 degree line) and, if we plug in $\theta = 0$, we get the matrix from part 2, which reflects over
the x-axis (i.e., the 0 degree line).
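The reflection results can be tested with a short script. This is a sketch under the assumption that the general reflection matrix has the form $\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}$; `reflect_about` is a helper name made up for this example.

```matlab
% Reflection across the line at angle theta (degrees) from the x-axis.
% reflect_about is a name chosen for this sketch.
reflect_about = @(theta) [cosd(2*theta)  sind(2*theta); ...
                          sind(2*theta) -cosd(2*theta)];

rect = [0 2 2 0; 0 0 1 1];        % rectangle corners as columns

F = reflect_about(30);
reflected = F * rect;

area_before = polyarea(rect(1,:), rect(2,:));
area_after  = polyarea(reflected(1,:), reflected(2,:));

d     = det(F);                   % determinant of the reflection
twice = F * F;                    % reflecting twice

Fy = reflect_about(90);           % 90 degree line: the y-axis
Fx = reflect_about(0);            % 0 degree line: the x-axis
```

Applying the reflection twice returns every point to where it started, so each reflection matrix is its own inverse.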


Solution 5.3 
1. This shearing matrix pulls the points along horizontal lines, and the strength of the pull is
proportional to the y coordinate. In the figure below, the blue rectangle is sheared to become
the orange rectangle:

Figure 5.4: Shearing in x direction.

The area of the rectangle does not change. The inverse is

$\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}$.

2. This shearing matrix pulls the points along vertical lines, and the strength of the pull is
proportional to the x coordinate. In the figure below, the blue rectangle is sheared to become
the orange rectangle:

Figure 5.5: Shearing in y direction.

The area of the rectangle does not change. The inverse is

$\begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}$.

3. This shearing matrix pulls the points along horizontal lines, and the strength of the pull is
proportional to the y coordinate and the constant k (the bigger the k, the stronger the pull). In
the figure below, with k = 2, the blue rectangle is sheared to become the orange rectangle:

Figure 5.6: Shearing in x direction with k = 2.

The area of the rectangle does not change. The inverse is

$\begin{bmatrix} 1 & -k \\ 0 & 1 \end{bmatrix}$.

4. This shearing matrix pulls the points along vertical lines, and the strength of the pull is
proportional to the x coordinate and the constant k (the bigger the k, the stronger the pull). In
the figure below, with k = 2, the blue rectangle is sheared to become the orange rectangle:

Figure 5.7: Shearing in y direction with k = 2.

The area of the rectangle does not change. The inverse is

$\begin{bmatrix} 1 & 0 \\ -k & 1 \end{bmatrix}$.
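A short script can confirm the shearing claims about area and inverses. This is a sketch assuming a horizontal shear of the form $\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$ with k = 2; the variable names are chosen here for illustration.

```matlab
rect = [0 2 2 0; 0 0 1 1];     % rectangle corners as columns
k = 2;                         % shear strength

Hx     = [1  k; 0 1];          % horizontal shear
Hx_inv = [1 -k; 0 1];          % candidate inverse: shear back by the same amount

sheared  = Hx * rect;
restored = Hx_inv * sheared;   % should reproduce the original corners

area_before = polyarea(rect(1,:), rect(2,:));
area_after  = polyarea(sheared(1,:), sheared(2,:));
```

The sheared shape is a parallelogram with the same base and height as the rectangle, which is why the area is unchanged.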
Homework 2: Matrix Operations
Overview and Orientation


Learning Objectives
Concepts
• Compute the determinant of a 2 x 2 matrix
• Know the relationship between the determinant of a matrix and whether the matrix is invertible
• Find the inverse of a 2 x 2 matrix by hand
• Use computational tools to find the inverse of an n x n matrix
• Design a 2 or 3-dimensional matrix that will scale a vector by given amounts in the x, y or z
direction
• Design a 3-dimensional matrix that will translate a 2-D vector by given amounts in x and y
MATLAB skills
• Represent a set of points in 2-D space (i.e., pairs of x, y values) as column vectors
• Transform a set of 2-D points (i.e., the outline of a shape) using a matrix to rotate and translate
the original
• Multiply matrices and find their inverses
• Compute the determinant of a matrix
6.1 Determinant of a Matrix


The determinant of a square matrix is a property of the matrix which indicates many important things,
including whether a matrix is invertible or not. We will see more of this when we see matrix inverses shortly.
The determinant of a matrix G can be written in a few different ways:


$\det(G) = |G|$ (6.1)


Consider a generic 2 x 2 matrix G:

$G = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$

The formula for the determinant of a 2 x 2 matrix is quite straightforward:

$\det(G) = ad - bc$ (6.2)


For example, for the following 2 x 2 matrix,

$\det \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = (1)(4) - (2)(3) = -2$ (6.3)
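As a quick check of formula (6.2), here is a minimal sketch that compares the hand formula with MATLAB's built-in `det`. The helper name `det2` is an assumption made for this example, not something defined in the course materials.

```matlab
% det2 applies the 2 x 2 formula ad - bc directly (name chosen for this sketch)
det2 = @(G) G(1,1)*G(2,2) - G(1,2)*G(2,1);

G = [1 2; 3 4];
by_formula = det2(G);     % (1)(4) - (2)(3)
by_builtin = det(G);      % computed numerically, so compare with a tolerance
```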


Exercise 6.1 


Return to the transformation matrices in the day assignment and calculate the determinant for the 
following: 


1. The generic 2 x 2 rotation matrix

$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$

2. The matrix which reflects over the y axis (we'll be meeting the reflection matrix in Week
2b, so if you haven't seen this yet, you can skip this problem for now and return to
it later, or just turn the crank using the formula, treating this matrix as you would
any arbitrary 2 x 2 matrix).

$\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$

3. The matrix which shears in the horizontal direction (we'll be meeting the shearing matrix
in Week 2b, so if you haven't seen this yet, you can skip this problem for now and
return to it later, or just turn the crank using the formula, treating this matrix as
you would any arbitrary 2 x 2 matrix).

$\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$


Exercise 6.2 
1. What do the following matrices do? Think about it first, draw some sketches, and then test
your hypothesis in MATLAB. How much does the area of your basic rectangle change, if at all?


(a) 


Ta 
ho 


2. Is it possible to “undo” the matrices above? Why or why not? 


(b) 


Exercise 6.3 
1. What are the determinants of the two matrices from the previous exercise, Exercise 6.2? 


2. Generalizing from Exercise 6.1 and Exercise 6.2, what's the relationship between the determinant
of a matrix and the result of transforming a rectangle by that matrix?


Finding the determinant of an n x n matrix, where n > 2, is a bit more computationally intensive. If
you want to learn how to do the procedure by hand, check out this Khan Academy video. For this course,
we simply recommend you use the det function in MATLAB.


6.2 Matrix Inverses 


Inverse of 2 x 2 Matrices 


In class you worked with rotation matrices and transformations that were compositions of simpler rotations,
and you learned how to invert them. When you multiply a vector by any matrix (not just one that is
associated with a simple spatial transformation), you transform the original vector into a new vector.
Beyond rotations, you can often undo such a linear transformation, just as you did with the rotation
matrix. Undoing a linear transformation is itself a linear transformation! Therefore the act of undoing a
linear transformation can be formulated as a matrix multiplication.


Exercise 6.4 


Consider the following matrices and vector. (Don't try to interpret these as intuitive geometrical
operations; we're just using them to explore the determinant.) Work out the following problems in
MATLAB.


1. Find w = Pu.

2. Find Qw. How is this related to u?

3. Find QP. Does the answer look familiar?

4. Find PQ.

5. Find the determinant of P. In MATLAB, you can compute the determinant of any (not just
2 x 2) matrix using the det function.

6. Find the determinant of Q.





A matrix B is said to be the inverse of the matrix A if, and only if, BA = I and AB = I, where I is the
identity matrix. For 2 x 2 matrices, the inverse (if it exists) is given by the following:

$G = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ (6.7)

$G^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$ (6.8)

The last equation should indicate to you that the inverse $G^{-1}$ is only defined if $ad - bc \neq 0$.
Sweet mother of linear algebra, $ad - bc$ is our buddy the determinant. More generally, any square matrix
can be inverted if and only if its determinant is non-zero.
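The 2 x 2 inverse formula can be tried out directly. The sketch below assumes a matrix with non-zero determinant; `inv2` is a hypothetical helper name used only for this illustration, and the result is compared against MATLAB's `inv`.

```matlab
% inv2 applies the 2 x 2 inverse formula; it assumes ad - bc is non-zero.
% (inv2 is a name chosen for this sketch.)
inv2 = @(G) (1 / (G(1,1)*G(2,2) - G(1,2)*G(2,1))) * ...
            [ G(2,2) -G(1,2); ...
             -G(2,1)  G(1,1)];

G    = [1 2; 3 4];             % determinant is -2, so G is invertible
Ginv = inv2(G);

check_left  = Ginv * G;        % both products should be the identity
check_right = G * Ginv;
difference  = norm(Ginv - inv(G));
```

If the determinant were zero, the factor `1 / (ad - bc)` would divide by zero, which is the formula's way of saying that no inverse exists.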

Now let’s practice calculating inverses, some of their properties, and how we may use them. 


Exercise 6.5 


All matrices A and B which have inverses have the following properties

$(AB)^{-1} = B^{-1}A^{-1}$
$(A^T)^{-1} = (A^{-1})^T$

1. Using the above properties, please compute the following by hand.
(a) If


find $(PB)^{-1}$. Recall that you already know the inverse of P from earlier.
(b) For P as defined above, find


$(P^T)^{-1}$ (6.12)


2. Use the inverse formula to calculate the inverses for the first three matrices in Exercise 6.1. 
Confirm your answers by multiplying the inverse with the original matrix. 


Note that matrix-vector equations like those above can be solved without explicitly computing the matrix
inverse, which is computationally expensive. (A nod to our future friend, left matrix divide or backslash
divide.)
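To make the nod to backslash concrete, here is a minimal sketch with a made-up 2 x 2 system (the numbers are assumptions chosen for illustration). It solves A x = b once with an explicit inverse and once with the backslash operator.

```matlab
A = [2 1; 1 3];          % example system: 2x + y = 3, x + 3y = 5
b = [3; 5];

x_inv       = inv(A) * b;   % forms the inverse explicitly, then multiplies
x_backslash = A \ b;        % solves the system without forming the inverse

gap = norm(x_inv - x_backslash);
```

Both approaches give the same answer for a small, well-behaved system like this one; backslash is the usual choice in MATLAB because it avoids the extra work of building the inverse.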





Inverse of n x n Matrices 


For higher-dimensional matrices, e.g. n x n matrices for n > 2, the matrix inverse is defined in the same
way. Suppose you have an n x n matrix A and an n x n matrix B. Then B is the inverse of A if and only if
BA = I and AB = I. The following are some properties of inverses of matrices:


• Only square matrices are invertible, i.e., only square matrices have inverses.


• A matrix has an inverse only if its determinant is non-zero.


There are a number of different procedures to compute the inverse of higher-dimensional matrices, but 
we will not be going into the details of their computation here. You can look them up if you are interested, 
or need to in the future. In MATLAB, you can compute the inverse of a matrix using the inv function. 


Exercise 6.6 


1. Suppose that you have an unknown number of apples, oranges and pears in your fridge. 
Suppose that each apple costs $1, each orange costs $2, and each pear costs $3. Assume also 
that the weight of every apple is 3 oz, every orange is 4 oz, and every pear is 3 oz. Additionally, 
suppose that the total weight of the fruits is 45 oz, and you paid a total of $21 for the fruit. 


(a) If possible find the numbers of oranges, apples and pears. If not, please explain why. 


(b) Suppose that you additionally know that you have a total of 14 fruits. Can you formulate 
and solve a matrix-vector equation to find out the numbers of oranges, apples and pears 
you have? 


(c) What is the determinant of the matrix you have set up to solve this? 


2. The fruit vendors bought the pricing algorithm from Uber. Oranges are still $2, pears are
now only $1.50, and (due to an influx of teachers) apples are now surging at $1.50 each. Their
weights stay the same. You return to the market, and again purchase 14 fruits, which have the
same total weight and total cost.


(a) Can you formulate and solve a matrix-vector equation to find out the numbers of oranges, 
apples and pears you have? 


(b) What is the determinant of the matrix you have set up to solve this? 
6.3 Transformation Matrices, Continued


Scaling


Let's return to two dimensions. In the Night 1 assignment, you also learned about scaling matrices. Recall that
the scaling matrix S scales the x-component by $s_1$ and the y-component by $s_2$:

$S = \begin{bmatrix} s_1 & 0 \\ 0 & s_2 \end{bmatrix}$

Let's assume for the moment that $s_1 = 2$ and $s_2 = 1/3$. Working with the rectangle defined in class whose
corners have coordinates (1, 2), (1, -2), (-1, 2), and (-1, -2), complete the following activities:


Exercise 6.7 


1. Predict what would happen if you operate on the rectangle with S.

2. Write a MATLAB script to carry out this operation and check your prediction.

3. How does the area of the rectangle change?

4. What matrix should you use to undo this scaling? Show that the product of this matrix with
the original scaling matrix is the identity matrix.

5. Define it in MATLAB and check. Again, this is the inverse matrix and we give it the symbol
$S^{-1}$.

6. In MATLAB, change the value of $s_2$ to 1 and find the product of the new S and your rectangle.
How does the area of the rectangle change? Change the value of $s_2$ back to 1/3.

7. Predict what would happen if you operate on the original rectangle with SR, where R is the
rotation matrix. How about RS? Implement both of these in MATLAB and check.

8. How would you undo each of these operations (SR and RS)? How is the inverse of the
product related to the individual inverses, i.e. what is the relationship between $(SR)^{-1}$ and
$S^{-1}$ and $R^{-1}$? What about $(RS)^{-1}$?





Translation 


It would be really useful if, in addition to scaling and rotating our objects, we could translate them. Let’s start 
by thinking about vectors and then we will figure out how to represent translation as a matrix operation. 
Consider an initial vector v and a translation vector t. The new translated vector is simply v + t. For
example, if you start with the initial vector $v = \begin{bmatrix} x \\ y \end{bmatrix}$ and the translation vector is
$t = \begin{bmatrix} t_x \\ t_y \end{bmatrix}$, then the new vector will be $v + t = \begin{bmatrix} x + t_x \\ y + t_y \end{bmatrix}$.

Wouldn't it be handy if we could define translation as a matrix operation? Yes, indeed it would be, we
hear you say. Here is the standard method: add another entry to the original vector, and set it equal to 1, i.e.,

$v = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$.

Now define the translation matrix as

$T = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}$


Exercise 6.8 


1. Show that Tv accomplishes the process of translation (if you ignore the third entry in the new
vector). What is the final vector?

2. Predict what would happen if you operate on our old friend the rectangle with the translation
matrix defined by $t_x = 2$ and $t_y = 3$.

3. Write a MATLAB script to carry out this operation and check your prediction. How has the
area of your rectangle changed?

4. What matrix should you use to undo this translation? Show on paper that the product of this
matrix with the original translation matrix is the identity matrix. Define it in MATLAB and
check. Again, this is the inverse matrix and we give it the symbol $T^{-1}$.

5. Choose a rotation matrix R. Predict what would happen if you operate on the original rectangle
with TR. How about RT? Implement both of these in MATLAB and check. How would you
undo each of these operations? (You will first have to adjust your definition of R so that it is
the correct size.)

6. Predict what would happen if you operate on the original rectangle with STR. How about
TRS? How would you undo each of these operations? (You will first have to adjust your
definition of S so that it is the correct size.)

7. How would you generalize translation to 3D?





Putting it all together: Dancing Animals 


In this activity you will animate a circus act. (No real or imaginary animals will be injured in this performance.) 
Here is what we would like you to do: 


Exercise 6.9 
1. Decide on an animal.

2. Decide on a circus act that consists of a set of translations, rotations (think back to Day 2),
shearings, and/or scalings in some order. Storyboard this idea and imagine the resulting
animation.

3. Propose a set of points that defines the outline and relevant features of your animal. You may
find ginput useful. Define the points in MATLAB and plot your animal.

4. Create a script that makes your animal dance (in 2-D, unless you really want to go 3-D). You
may want to make use of the pause and drawnow commands.

5. Now use your sequence of operations and animate your animal! In class you will have the
opportunity to show off your dancing animal!
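One possible shape for such an animation script is sketched below. It is only a starting point under several assumptions: the "animal" is a plain triangle, the act is one full turn combined with a slide to the right, and all variable names are made up for this example.

```matlab
% A triangle standing in for the animal outline (hypothetical shape);
% the first point is repeated at the end to close the outline.
animal   = [0 1 0.5 0; 0 0 1 0];
animal_h = [animal; ones(1, size(animal, 2))];   % add a row of ones for translation

n_frames = 60;
for k = 1:n_frames
    theta = 360 * k / n_frames;                  % rotation angle in degrees
    R = [cosd(theta) -sind(theta) 0; ...
         sind(theta)  cosd(theta) 0; ...
         0            0           1];
    T = [1 0 3*k/n_frames; 0 1 0; 0 0 1];        % slide to the right
    frame = T * R * animal_h;                    % rotate first, then translate

    plot(frame(1,:), frame(2,:), 'b-');
    axis equal;
    axis([-2 5 -2 2]);
    drawnow;                                     % update the figure now
    pause(0.02);                                 % brief delay between frames
end
```

Replacing the triangle with points gathered by ginput, and adding shearing or scaling matrices to the product, turns this into a fuller circus act.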





6.4 Conceptual Quiz 


1. The orange shape is the result of applying a matrix M to the blue rectangle. 
















What is the determinant of M? 


2. The orange shape is the result of applying a matrix M to the blue rectangle. 
What is the determinant of M?


3. The determinant is multiplicative, i.e., $\det(AB) = \det(A)\det(B)$. Let M be a matrix such that
$\det(M) = 4$. What's $\det(M^{-1})$? (Hint: $\det(I) = 1$.)


4. Let R be a rectangle with area 1. Apply the scaling matrix $S = \begin{bmatrix} s_1 & 0 \\ 0 & s_2 \end{bmatrix}$. What is the area of SR?
A a 
Bi 
C. $s_1 s_2$
D. $s_1 + s_2$


5. True or false: Any shearing matrix S and any rotation matrix R commute, i.e., RS = SR.
Solution 6.1
1. The determinant is 1. (Recall that $\cos^2\theta + \sin^2\theta = 1$.)
2. The determinant is -1. 


3. The determinant is 1. 


Solution 6.2 


1. Each of the figures below shows the basic blue rectangle and the orange rectangle, which is 
the result of applying the transformation. 


































2. It is not possible to undo these matrix transformations. Since everything is squished onto the 
same line, we would not be able to distinguish the original vectors. 


Notice that, in the above matrices, the second row is a constant multiple of the first row. In other
words, the matrix looks like $\begin{bmatrix} a & b \\ ca & cb \end{bmatrix}$ for some constant c. If we apply a matrix of this form to a
point in 2D space represented by the vector $\begin{bmatrix} x \\ y \end{bmatrix}$, then the result will be $\begin{bmatrix} z \\ cz \end{bmatrix}$, where $z = ax + by$. In
other words, the resulting point will always fall on the line $y = cx$.


Solution 6.3 


aet([F j]) =@a)- aay =o 
aet([F 5] ) = @@)- 4) =0 


2. The determinant appears to tell us how the area of a rectangle changes as a result of applying
the transform. Each of the matrices that had a determinant of 0 transformed rectangles to line
segments (which have 0 area), whereas matrices with determinants of 1 or -1 didn't change
the area of the transformed rectangles. The matrix with determinant of -1 (reflection) seems
to also be telling us something about how the sides of the rectangle change relative to each
other.


Solution 6.4 


w= Pu= i 
Qw = QPu = Hl 
3, 4. $QP = PQ = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$,
which is the identity matrix.

5. The determinant of P is 2.

6. The determinant of Q is 1/2.


Solution 6.5 


pp [P78 


(a) 


(b) 


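2. For the rotation matrix,

$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$

and we can confirm this by multiplying:

$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta + (-\sin\theta)^2 & \cos\theta\sin\theta - \sin\theta\cos\theta \\ \sin\theta\cos\theta - \cos\theta\sin\theta & \sin^2\theta + \cos^2\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

For the reflection over the y axis,

$\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$

For the horizontal shear,

$\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}$ (6.13)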


Solution 6.6 


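1. (a) It's not possible to find the numbers of oranges, apples, and pears. We have the equation

$\begin{bmatrix} 2 & 1 & 3 \\ 4 & 3 & 3 \end{bmatrix} \begin{bmatrix} n_o \\ n_a \\ n_p \end{bmatrix} = \begin{bmatrix} 21 \\ 45 \end{bmatrix}$,

but we cannot take the inverse of a 2 x 3 (non-square) matrix. With two equations and three
unknowns, the system does not have a unique solution.


(b) Now we have the equation

$\begin{bmatrix} 2 & 1 & 3 \\ 4 & 3 & 3 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} n_o \\ n_a \\ n_p \end{bmatrix} = \begin{bmatrix} 21 \\ 45 \\ 14 \end{bmatrix}$

So by taking the inverse of the 3 x 3 matrix we find that $n_o = 3$, $n_a = 9$ and $n_p = 2$.


(c) The determinant of the matrix is 2.


2. (a) The equation becomes

$\begin{bmatrix} 2 & 3/2 & 3/2 \\ 4 & 3 & 3 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} n_o \\ n_a \\ n_p \end{bmatrix} = \begin{bmatrix} 21 \\ 45 \\ 14 \end{bmatrix}$

But the matrix is not invertible, so we cannot solve for the number of fruit.


(b) The determinant of the matrix is 0.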
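The fruit problem can be checked in a few lines. This sketch assumes the unknowns are ordered as oranges, apples, pears, with rows for total cost, total weight, and total count; the variable names are chosen for illustration.

```matlab
% Rows: cost ($), weight (oz), count. Columns: oranges, apples, pears.
A = [2 1 3; 4 3 3; 1 1 1];
b = [21; 45; 14];

n  = A \ b;               % numbers of oranges, apples and pears
d1 = det(A);              % non-zero, so a unique solution exists

% Surge pricing: apples and pears both cost $1.50
A2 = [2 1.5 1.5; 4 3 3; 1 1 1];
d2 = det(A2);             % zero (up to rounding): the matrix is not invertible
r2 = rank(A2);            % fewer than 3 independent rows
```

With the new prices, the weight row is exactly twice the cost row, so the second equation adds no new information and the fruit counts cannot be pinned down.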


Solution 6.7 


1. The length of the rectangle would double in the x direction and be reduced to 1/3 the length in 
the y direction. 


2. First we define the corners of the rectangle as the columns in a matrix
» points = [1 1 -1 -1; 2 -2 -2 2]
and we define the scaling matrix
» S = [2 0; 0 1/3]
Then we simply multiply them
» scaledpoint = S*points
Plotting them gives the original rectangle in blue and the scaled rectangle in orange.























3. The area is reduced from 8 units² to 5.33 units², or 2/3 of the original area.


4. To undo the process we use the inverse of the S matrix, $S^{-1}$:

$S^{-1} = \begin{bmatrix} 0.5 & 0 \\ 0 & 3 \end{bmatrix}$

You should check that $S^{-1}S = SS^{-1} = I$.


5. We define the inverse matrix
» Sinv = [0.5 0; 0 3]
and check that
» S*Sinv
and
» Sinv*S
both produce the identity matrix.


6. The area of the rectangle doubles.

7. When the original rectangle is operated on with SR, the resulting image will be a horizontally
stretched parallelogram. When the original rectangle
is operated on with RS, the resulting image will be the scaled rectangle from the previous
exercise only rotated 60 degrees counter-clockwise.


8. $(SR)^{-1} = R^{-1}S^{-1}$ and $(RS)^{-1} = S^{-1}R^{-1}$
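The area claims in this solution can be verified numerically. The sketch below assumes the same corner ordering as the `points` matrix above and uses `polyarea` to measure the rectangles; the remaining variable names are made up for this example.

```matlab
points = [1 1 -1 -1; 2 -2 -2 2];   % rectangle corners as columns
S      = [2 0; 0 1/3];             % scaling matrix with s1 = 2, s2 = 1/3
Sinv   = [0.5 0; 0 3];             % candidate inverse

scaled = S * points;

area_before = polyarea(points(1,:), points(2,:));
area_after  = polyarea(scaled(1,:), scaled(2,:));
ratio       = area_after / area_before;

d = det(S);                        % compare with the area ratio
```

The area ratio and the determinant of S come out the same, which matches the relationship between determinants and areas explored in Exercise 6.3.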


Solution 6.8 
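1. $Tv = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \\ 1 \end{bmatrix}$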


2. The rectangle would be moved 2 to the right and 3 up. 
3. The area of the rectangle does not change.


4. $T^{-1} = \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & -3 \\ 0 & 0 & 1 \end{bmatrix}$


5. If the original rectangle is operated on by TR, the rectangle would first be rotated with respect
to the origin and then translated. If the original rectangle is operated on by RT, the rectangle
would first be translated and then rotated. As rotation happens with respect to the origin, the
two operations will not result in the same rectangle.


To undo the operation TR, the resulting figure should be operated on by $R^{-1}T^{-1}$. To undo
the operation RT, the resulting figure should be operated on by $T^{-1}R^{-1}$.


6. If the original rectangle is operated on with STR, the resulting image will be of the rectangle
rotated 60 degrees around the origin, translated 2 to the right and 3 up and then scaled by
S. If the original rectangle is operated on with TRS, the resulting image will be the scaled
rectangle rotated 60 degrees around the origin and then translated 2 to the right and 3 up.


To undo STR, the resulting figure should be operated on by $R^{-1}T^{-1}S^{-1}$. To undo TRS,
the resulting figure should be operated on by $S^{-1}R^{-1}T^{-1}$.


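7. In 3D, append a 1 to the vector to get $v = \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$ and use the 4 x 4 translation matrix

$T = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$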
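The translation results can be reproduced with a short script. This is a sketch under stated assumptions: the rectangle is the one from the scaling exercise, the rotation angle is 60 degrees, and the variable names are chosen for illustration.

```matlab
pts   = [1 1 -1 -1; 2 -2 -2 2];            % rectangle corners as columns
pts_h = [pts; ones(1, size(pts, 2))];      % append a row of ones

tx = 2; ty = 3;
T    = [1 0  tx; 0 1  ty; 0 0 1];          % translation matrix
Tinv = [1 0 -tx; 0 1 -ty; 0 0 1];          % translation back by the same amounts

theta = 60;                                % rotation angle in degrees
R = [cosd(theta) -sind(theta) 0; ...
     sind(theta)  cosd(theta) 0; ...
     0            0           1];          % rotation resized to 3 x 3

moved = T * pts_h;                         % every corner shifts by (2, 3)
back  = Tinv * moved;                      % undo the translation

TR = T * R * pts_h;                        % rotate about the origin, then translate
RT = R * T * pts_h;                        % translate, then rotate about the origin
undone = inv(R) * inv(T) * TR;             % applies R^{-1} T^{-1} to undo TR
```

Comparing `TR` and `RT` shows that the order matters here, since rotation always pivots about the origin rather than about the rectangle's centre.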
7.1 Debrief and Dancing Animal Demos [30 mins]


• Please discuss your overnight work with your breakout-room mates, and create a set of key concepts and
a set of ideas that you are still confused by.


• Be prepared to demo your dancing animal to your breakout room.


7.2 Synthesis [20 mins] 


Exercise 7.1 
You should do all of these. 


1. Assume the matrix D represents a geometrical object. What is the correct matrix expression if 
we want to rotate it first (R), then scale it (S), and finally translate (T) it? 


A. DRST
B. TSRD
C. RSTD
D. DTSR


2. What would be the correct expression in order to undo the transformation in the previous 
problem? 


3. A and B are square, invertible matrices of the same size. Which of the following are always
true (no matter the entries in A and B)?
A. $(AB)^T = B^T A^T$
B. $(AB)^{-1} = B^{-1}A^{-1}$
C. $(A^T)^{-1} = (A^{-1})^T$
D. $\det(AB) = \det(A)\det(B)$
E. $A + B = B + A$
F. $AB = BA$
G. $\det(AB) = \det(A) + \det(B)$
H. $(AB)^T = A^T B^T$
I. $(AB)^{-1} = A^{-1}B^{-1}$





7.3 Mini Lecture Linear Independence, Span, Basis [20 mins] 


7.4 Linear Independence [20 mins] 


A set of non-zero vectors is linearly independent if it is not possible to scale and sum them to make the all
zeros vector, except when the scale factors are all zero.

If 3-dimensional vectors $x_1, x_2, x_3$ are linearly independent, it means that it is not possible to find scale
factors $c_1, c_2, c_3$ so that

$c_1 x_1 + c_2 x_2 + c_3 x_3 = 0$ (7.1)

except when $c_1, c_2, c_3$ are all zero.

This property also implies that if you have n linearly independent, n-dimensional vectors, you can
express any other n-dimensional vector by scaling and summing those linearly independent vectors.

If 3-dimensional vectors $x_1, x_2, x_3$ are linearly independent, it means that for any 3-dimensional vector
$x_q$, it is possible to find scale factors $d_1, d_2, d_3$ so that

$d_1 x_1 + d_2 x_2 + d_3 x_3 = x_q$. (7.2)
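Finding the scale factors in equation (7.2) amounts to solving a matrix-vector equation. The sketch below uses three example vectors and a target vector that are assumptions chosen for illustration, stacks the vectors as the columns of a matrix, and solves for the scale factors with backslash.

```matlab
% Columns of X are x1, x2, x3 (example values chosen for this sketch)
X  = [1 1 0; ...
      0 1 1; ...
      0 0 1];
xq = [2; 3; 4];            % the vector we want to build

d = X \ xq;                % scale factors d1, d2, d3

% Rebuild xq by scaling and summing the three vectors
rebuilt = d(1)*X(:,1) + d(2)*X(:,2) + d(3)*X(:,3);
```

This works for any choice of `xq` because the three example columns are linearly independent.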





Exercise 7.2 


1. Determine which of the following sets of vectors are linearly independent. 


iy aah (Re 
(a) |o|, |1], Jo 
Ol WOih ke 
(ie Mie a0. 
(Byesleat tl ealal 
o| jo} fo 
iE peal je 
() {2|, {1}, |4 
3 3 


0 


(d) p, q, rand s, where the vectors are all 3-dimensional. 
2. Consider two column vectors

$a_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad a_2 = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}$ (7.3)

Both these vectors lie on the xy-plane since their z components are zero. Define a new vector
$a_3 = c_1 a_1 + c_2 a_2$, where $c_1$ and $c_2$ are arbitrary scalars. Therefore $a_3$ is a linear combination
of $a_1$ and $a_2$.


(a) Does $a_3$ also lie on the xy-plane?


(b) Next, define a 3 x 3 matrix A whose columns are $a_1$, $a_2$ and $a_3$. Show that the product
of A and any 3 x 1 vector always lies on the xy-plane.
Solution 7.2
1. (a) They are linearly independent since they span $\mathbb{R}^3$.


(b) They are linearly dependent since the first vector is equal to the second vector plus two
times the third vector.


(c) They are linearly dependent since the third vector is equal to the first vector plus two
times the second vector.


(d) They are linearly dependent. You can have a maximum of n linearly independent vectors
in $\mathbb{R}^n$.
(e) They are linearly independent since they do not lie on the same line.
2. (a) Yes, a linear combination of two vectors which lie in the xy-plane will also lie in the
xy-plane.


(b) Let A be the matrix

$A = \begin{bmatrix} 1 & 1 & c_1 + c_2 \\ 1 & 2 & c_1 + 2c_2 \\ 0 & 0 & 0 \end{bmatrix}$

and let v be an arbitrary 3 x 1 vector

$v = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$

Then the product

$Av = \begin{bmatrix} x + y + (c_1 + c_2)z \\ x + 2y + (c_1 + 2c_2)z \\ 0 \end{bmatrix}$

lies in the xy-plane, since its z component is zero.
Week 3b: Linear Independence, Span, Basis
In the last class, you were introduced to the ideas of linear independence, span and basis. Today, we
will dig deeper into these ideas. We will start with some review of the main ideas from the previous class,
followed by an introduction of some new ideas.


8.1 Linear Independence and Span [30 mins] 


Exercise 8.1 
1. In words, describe what it means for a set of 3 x 1 vectors to be linearly independent. 


2. What is the maximum number of vectors in a set of 3 x 1 vectors that are linearly independent? 


Exercise 8.2 


Next, let’s do a problem similar to what you saw in last class in MATLAB. Consider the following 
matrix: 


(ale: 
B=|1 2 4 (8.1) 
ie ale 


The third column of this matrix equals the second column plus twice the first column. Hence these 
three vectors lie on some plane (not the xy-plane as in the previous part). 


1. Open up MATLAB and, using the quiver3 command together with hold on, please plot
the vectors corresponding to the three columns of B. Note that typing » quiver3(0,0,0,1,1,1);
in MATLAB plots an arrow from the origin to the point (1,1,1), i.e. it plots the
vector corresponding to the first column of B. Typing » hold on in MATLAB results in
subsequent calls to quiver3 appearing on the same axes, without erasing previous arrows.


2. Using the "rotate 3D" function on the MATLAB figure window, rotate the figure around so that
it appears as if all three arrows overlap. This should indicate that the vectors lie on a plane.


3. Using det, compute the determinant of matrix B. Does this make sense?


The fundamental property here is that the columns of the A and B matrices are not linearly independent. 
We shall next define the idea of linearly independent vectors more formally, for an arbitrary number of 
dimensions (recall that we went through this for 3 dimensional vectors in the last class). 


• A finite set $S = \{x_1, x_2, \ldots, x_m\}$ of vectors in $\mathbb{R}^n$ is said to be linearly dependent if there exist scalars
$c_1, c_2, \ldots, c_m$, which are not all zero, such that

$c_1 x_1 + c_2 x_2 + \ldots + c_m x_m = 0.$

Note that $\mathbb{R}^n$ here refers to the set of all n-dimensional vectors that are made up of real numbers. (For
example, $\mathbb{R}^1$ is the real line and $\mathbb{R}^2$ is the plane.) For any value of n, $\mathbb{R}^n$ is an example of a vector space;
we will meet different examples of vector spaces in the future. We can also express this equation
using a matrix A, whose columns are $x_1, x_2, \ldots, x_m$:

$\begin{bmatrix} x_1 & x_2 & \ldots & x_m \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{bmatrix} = 0.$ (8.2)

If a non-zero solution exists to $Ac = 0$ then the set of vectors $x_1, x_2, \ldots, x_m$ is linearly dependent.
In the case of a square matrix (n = m), the vectors $x_1, x_2, \ldots, x_m$ are linearly dependent if and only if
$\det(A) = 0$. Otherwise, the only way to satisfy the equation above is if $c_1 = c_2 = \cdots = c_m = 0$.
Figure 8.1 illustrates two examples of three vectors that are in 3D space, but are linearly dependent,
since in each case, all three vectors are on a plane.

Figure 8.1: Linearly dependent vectors in $\mathbb{R}^3$. (from Wikimedia Commons).








The set of vectors X1,X2,..., Xm is linearly independent if it is not linearly dependent. In other words, 
the set of vectors x1, X2,..., Xm is linearly independent if 

CyX1 + C2KQ +...+€mXm = 0 (8.3) 
only when c; = co =-+: = Cm = 0. In other words, if the only solution to Ac = 0 is c = O, the set 


of vectors made up of the columns of A is linearly independent. For a square matrix this means the 


set is linearly independent if and only if det(A) # 0. 
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Figure 8.2: Linearly independent vectors in R°. (from Wikimedia Commons). 


- The span of S is the set of all linear combinations of its vectors. In other words, the span of the set S
is the set of all possible vectors of the form


$$c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_m\mathbf{x}_m.$$

The span is usually denoted by $\operatorname{span}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m)$.


- A finite set $S = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m\}$ of vectors is said to form a basis of a vector space V if the vectors
in S are linearly independent, and every point in V can be expressed as a linear combination of the
vectors in the set S. Hence, if a set of vectors S is linearly independent, those vectors form a basis of
the set which is the span of those vectors.
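The definitions above can be checked numerically. The following is a minimal MATLAB sketch, assuming three illustrative vectors chosen for this example (they are not taken from the exercises). It packs the vectors into the columns of a matrix and uses `rank` and `det` to test for linear independence.

```matlab
% Illustrative vectors (an assumption for this sketch, not from the text).
x1 = [1; 0; 0];
x2 = [0; 1; 0];
x3 = [1; 1; 0];              % x3 = x1 + x2, so the set is dependent

A = [x1 x2 x3];              % the columns of A are the vectors

% The rank counts the linearly independent columns, which is also
% the dimension of the span of the set.
r = rank(A);
isIndependent = (r == size(A, 2));

% For a square matrix, a zero determinant also signals dependence.
d = det(A);

% Swapping x3 for a vector off the plane gives an independent set,
% i.e. a basis for R^3.
B = [x1 x2 [0; 0; 1]];
isBasis = (rank(B) == 3);
```

Here the rank of 2 says that the three vectors only span a plane, which matches the picture in Figure 8.1.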


Let’s solidify our understanding of linear dependence, bases and span by working on a few problems by hand.


Exercise 8.3 


1. In words, describe the span of the vectors Hl and 2] é 
2. In words, describe the span of the vectors $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}$, which are all in 3-dimensional
Euclidean space.


8.2 Orthogonality [30 mins]





Figure 8.3: Projection 


By trigonometry, if we have two vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ which have an angle of $\theta$ between them, the component
of $\mathbf{v}_2$ which lies along the direction of $\mathbf{v}_1$ is $|\mathbf{v}_2| \cos\theta$. Since the dot product of the two vectors can be
expressed as $|\mathbf{v}_1||\mathbf{v}_2| \cos\theta$, this component (referred to as the projection) can be written as $\mathbf{v}_1 \cdot \mathbf{v}_2 / |\mathbf{v}_1|$. If
the projection is zero, the vectors are orthogonal, and $\mathbf{v}_1 \cdot \mathbf{v}_2 = 0$. If the vectors are unit length, in addition
to being orthogonal, the vectors are said to be orthonormal. Additionally, if a basis set is made up of orthonormal
vectors, it is known as an orthonormal basis.

A square matrix with columns of unit vectors which are orthogonal to each other is known as an
orthogonal matrix. An orthogonal matrix $A$ has the property that $A^T = A^{-1}$.
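The orthogonal-matrix property can be verified numerically. The sketch below assumes a 2 × 2 rotation matrix as the example (the angle is an arbitrary choice for illustration) and checks that its columns are orthonormal and that its transpose equals its inverse.

```matlab
% A rotation matrix, used here as an assumed example of an orthogonal matrix.
theta = pi/6;
Q = [cos(theta) -sin(theta);
     sin(theta)  cos(theta)];

% The columns should be orthogonal (zero dot product) and unit length.
colDot   = Q(:, 1)' * Q(:, 2);
colNorms = [norm(Q(:, 1)) norm(Q(:, 2))];

% For an orthogonal matrix, Q'*Q is the identity and Q' equals inv(Q).
errIdentity = norm(Q' * Q - eye(2));
errInverse  = norm(Q' - inv(Q));
```

Both error values come out at the level of floating-point round-off rather than exactly zero, which is why the check compares against a small tolerance instead of testing for equality.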


Exercise 8.4 


Which of the following pairs of vectors are orthogonal or orthonormal? 





8.3 Decomposition [30 mins]


Suppose we have a set (collection) of m basis vectors $\{\mathbf{v}_i\}$ which are normalized ($|\mathbf{v}_i| = 1$), mutually
orthogonal ($\mathbf{v}_i^T \mathbf{v}_j = 0$ unless $i = j$) and span our space (every point can be written as some linear
combination of the vectors $\{\mathbf{v}_i\}$). How do we actually find the linear combination which is equal to a given
vector in our space?

Let’s say we have a vector $\mathbf{w}$ which we are interested in expressing as a linear combination of our set of
orthonormal vectors $\{\mathbf{v}_i\}$. We can write this linear combination as


$$\mathbf{w} = \sum_{i=1}^{m} c_i \mathbf{v}_i \tag{8.4}$$


and our problem is now to find the coefficients $c_i$ in this expression.
The obvious option is to pack the basis vectors $\mathbf{v}_i$ into the columns of a matrix $A$, and find solutions of


$$A\mathbf{c} = \mathbf{w}.$$


Since the columns of $A$ are formed from basis vectors, they are linearly independent, so a unique solution
exists and can be determined by the usual methods.

However, our basis vectors form an orthogonal set (collection) which permits a more direct calculation.
Consider a particular vector $\mathbf{v}_k$ in our basis set, and let’s take the dot product between $\mathbf{v}_k$ and our vector $\mathbf{w}$:


$$\mathbf{v}_k^T \mathbf{w} = \mathbf{v}_k^T \sum_{i=1}^{m} c_i \mathbf{v}_i \tag{8.5}$$

Distributing the dot product into the summation we have:

$$\mathbf{v}_k^T \mathbf{w} = \sum_{i=1}^{m} c_i \mathbf{v}_k^T \mathbf{v}_i \tag{8.6}$$

But from orthogonality we know that the dot product of any two different vectors in our orthonormal set is
zero, so all terms in the sum where $k \neq i$ are zero. This leads to the following simplification

$$\mathbf{v}_k^T \mathbf{w} = c_k \mathbf{v}_k^T \mathbf{v}_k \tag{8.7}$$

In addition, since our set of vectors is normalized, we know that $\mathbf{v}_k^T \mathbf{v}_k = 1$, leaving us with

$$\mathbf{v}_k^T \mathbf{w} = c_k \tag{8.8}$$


This gives us a very nice, simple way of decomposing a vector into a linear combination of the vectors within 
our basis set. The dot product of each basis vector with our target vector will result in the coefficient of that 
term in the linear decomposition. 
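The result in (8.8) can be tried out directly. The following MATLAB sketch assumes a small orthonormal basis and target vector chosen purely for illustration. It finds each coefficient with a dot product, rebuilds the vector, and compares against solving the matrix equation.

```matlab
% Assumed orthonormal basis for the plane (illustrative values only).
v1 = [3; 4] / 5;
v2 = [-4; 3] / 5;
w  = [2; 1];                  % vector to decompose

% Equation (8.8): each coefficient is the dot product v_k' * w.
c1 = v1' * w;
c2 = v2' * w;

% Rebuild w from its coefficients as a check on the decomposition.
wRebuilt = c1 * v1 + c2 * v2;

% The same coefficients from the matrix equation A*c = w.
A = [v1 v2];
cSolve = A \ w;
```

The dot-product route needs no matrix solve at all, which is the practical payoff of choosing an orthonormal basis.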


Exercise 8.5 


1. There are many (in general, an infinite number) of bases for a given set V. Hence, we can 
describe elements in the set V as linear combinations of vectors from different bases. Consider 
the following two basis sets which form bases for 2-dimensional space. 


“vVi= a 7 = | and 


il 


2 
il 


V2 


Express the vector $\mathbf{w} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ as a linear combination of the first basis set (i.e., a sum of scaled
versions of each vector in the basis set). Repeat for the second. Please make two different
drawings of $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$, one expressed as a sum of scaled vectors in the first basis set and another for
the vectors from the second basis set. Please label the lengths of each vector in the set.


2. Suppose that you wish to write the vector $\mathbf{w} = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix}$ as a linear combination of the vectors
$\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, $\mathbf{v}_2 = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}$ and $\mathbf{v}_3 = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$.


Please write a matrix equation to find the coefficients of the linear combination, and solve for
the coefficients using MATLAB if possible.


3. Representing vectors using different bases is a very powerful technique that we will keep 
coming back to in this class (in both semesters). Vectors described in different bases can 
give us insight that may not be so obvious when viewed in the original basis. Representing 
vectors in different bases can also be used for dimensionality reduction, which is an important 
technique that is used to speed up computations and compress data in a number of different 
fields. Here we will consider a problem of lossy data compression using a change of basis. 
Lossy compression refers to methods of representing data more efficiently, but with a loss of 
accuracy. Examples of lossy data compression include jpg images, and mp3 audio files. If care 
is taken in lossy compression, the effects of the data loss can be kept at acceptable levels (this 
is of course subjective and dependent on the application). We will start with a toy example 
and then move to more complicated ones in subsequent homework problems. Consider a set 
of four 2-dimensional data variables stored in the following vectors: 


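$$\mathbf{d}_1 = \begin{bmatrix} 2.2 \\ 1.2 \end{bmatrix}, \quad \mathbf{d}_2 = \begin{bmatrix} 1 \\ 0.6 \end{bmatrix}, \quad \mathbf{d}_3 = \begin{bmatrix} 1.5 \\ 0.7 \end{bmatrix}, \quad \mathbf{d}_4 = \begin{bmatrix} 1.7 \\ 0.8 \end{bmatrix}$$


(a) In MATLAB, plot the data using points (without lines connecting them) by typing
plot([2.2 1 1.5 1.7],[1.2 0.6 0.7 0.8],'o'); You will find that
these points lie close to the line through the origin with slope 1/2.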


(b) Define a unit vector that points in the direction $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ and call it $\mathbf{u}_1$. Find another unit
vector that is orthogonal to $\mathbf{u}_1$ and call it $\mathbf{u}_2$. These vectors form a basis in 2-dimensional
space.


(c) Rather than storing the original data, we are now going to express the original data in
terms of the new basis that we have defined. To do that, write $\mathbf{d}_1$, $\mathbf{d}_2$, $\mathbf{d}_3$ and $\mathbf{d}_4$ as a
linear combination of $\mathbf{u}_1$ and $\mathbf{u}_2$. You can use MATLAB here to find the coefficients.


(d) In this toy example, we are going to "compress" our data by only keeping the coefficients
corresponding to $\mathbf{u}_1$, i.e., we will discard the coefficient corresponding to $\mathbf{u}_2$. Suppose
that we wish to recover approximations to $\mathbf{d}_1, \mathbf{d}_2, \mathbf{d}_3, \mathbf{d}_4$ from the four coefficients.
These approximations, which you should denote by $\hat{\mathbf{d}}_1, \ldots, \hat{\mathbf{d}}_4$, are all scaled versions of
$\mathbf{u}_1$. In your axes from part (a), please plot the points corresponding to $\hat{\mathbf{d}}_1, \ldots, \hat{\mathbf{d}}_4$. Do you
think they make good approximations?


(e) We can describe how well our compressed data represents our original data. One way to
do this is to calculate the difference between our original and compressed data, and call
this error vector $\mathbf{f}_i = \mathbf{d}_i - \hat{\mathbf{d}}_i$. Now, compute the size of this error using norm($\mathbf{f}_i$) for
i = 1, 2, 3, 4. Then, summarize the error by finding the root-mean-square (RMS) error
between your approximations and the true data points. The RMS function squares the
errors, takes the mean, and then takes the square root. This quantity is a single number
that can be used to measure how well or poorly your compressed data represents your
original data. You may find MATLAB’s norm and rms functions helpful here.

This toy example illustrates that we can sometimes be more efficient (albeit at the cost of some
accuracy) in representing (or computing) data when it is expressed in certain bases.
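The compress-and-measure workflow described in this exercise can be sketched on a different data set. The MATLAB code below assumes three made-up points near the line of slope 1/2 (deliberately not the exercise data) and computes the RMS error by hand rather than with `rms`, on the assumption that `rms` may not be available in every installation.

```matlab
% Hypothetical data (not the exercise data): three points near the
% line through the origin with slope 1/2, stored as columns.
D = [4.1 2.0 2.9;
     1.9 1.1 1.6];

% Orthonormal basis: u1 along the line, u2 perpendicular to it.
u1 = [2; 1] / sqrt(5);
u2 = [-1; 2] / sqrt(5);
U  = [u1 u2];

% Because U has orthonormal columns, the coefficients are dot products.
C = U' * D;                  % row 1: u1 coefficients, row 2: u2 coefficients

% "Compress" by keeping only the u1 coefficients, then reconstruct.
Dhat = u1 * C(1, :);

% Error vectors, their lengths, and the RMS of those lengths.
F        = D - Dhat;
errNorms = sqrt(sum(F.^2, 1));
rmsErr   = sqrt(mean(errNorms.^2));
```

With an orthonormal basis, the length of each error vector equals the magnitude of the discarded $\mathbf{u}_2$ coefficient, so small discarded coefficients mean a small RMS error.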
Solution 8.1


. A set of vectors is linearly independent if it is not possible to scale and sum these vectors to
result in the all-zeros vector, except if all the scale factors are zero.


. You can have at most 3 vectors in a set of linearly independent 3 x 1 vectors.


Solution 8.2 


. Type the following into MATLAB: 
» quiver3(0,0,0,1,1,1) 
» hold on 
» quiver3(0,0,0,1,2,1) 
» quiver3(0,0,0,3,4,3) 


. The determinant of B is zero. Recall that a matrix is not invertible if and only if the determinant 
is zero. This matrix is not invertible since it collapses all vectors to a plane. 


Solution 8.3 


1. The span of these two vectors is all of $\mathbb{R}^2$, i.e., a plane.


2. The span of these three vectors is the xy-plane in $\mathbb{R}^3$.


Solution 8.4 


. The dot product of these two vectors is non-zero, so they are not orthogonal. 


. The dot product of these two vectors is zero, so they are orthogonal. 


. The dot product of these two vectors is zero, so they are orthogonal. Furthermore, each vector 


is unit length, so they are orthonormal. 


Solution 8.5 


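1. It’s clear that $2\mathbf{v}_1 + 3\mathbf{v}_2 = \mathbf{w}$. We visualize this as

[Figure: $\mathbf{w}$ drawn as the sum of $2\mathbf{v}_1$ and $3\mathbf{v}_2$.]

To write $\mathbf{w}$ as a linear combination of the basis vectors $\mathbf{u}_1$ and $\mathbf{u}_2$ requires a bit more work.
We can set up the matrix equation

$$\begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \mathbf{w}$$

and solve for the coefficients $c_1$ and $c_2$ for which $c_1\mathbf{u}_1 + c_2\mathbf{u}_2 = \mathbf{w}$. We can visualize this as

[Figure: $\mathbf{w}$ drawn as the sum of scaled copies of $\mathbf{u}_1$ and $\mathbf{u}_2$.]


2. First, we create a matrix in MATLAB whose columns are the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$,

» V=[1 3 1; 1 1 2; 1 2 2]

and the vector $\mathbf{w}$,

» w=[1; 2; 4].

Let $\mathbf{c}$ be the vector of coefficients. We have the equation $V\mathbf{c} = \mathbf{w}$, so to solve for $\mathbf{c}$ we compute
$\mathbf{c} = V^{-1}\mathbf{w}$. In MATLAB, we use » inv(V)*w. This tells us that $\mathbf{w} = -10\mathbf{v}_1 + 2\mathbf{v}_2 + 5\mathbf{v}_3$.


3.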


(a)
(b) We define » u1=[2; 1] and » u2=[-1; 2]. There are other choices for $\mathbf{u}_2$, but
they are all constant multiples of this choice, e.g., » u2=[-2; 4].


(c) Create a 2 x 2 matrix with $\mathbf{u}_1$ and $\mathbf{u}_2$ as the columns,
» U=[2 -1; 1 2]
and a 2 x 4 matrix with the vectors $\mathbf{d}_i$ as the columns,
» D=[2.2 1 1.5 1.7; 1.2 0.6 0.7 0.8].
Then compute
» inv(U)*D
to get the matrix of coefficients. This tells us that


$$\mathbf{d}_1 = 1.12\mathbf{u}_1 + 0.04\mathbf{u}_2, \quad \mathbf{d}_2 = 0.52\mathbf{u}_1 + 0.04\mathbf{u}_2,$$


$$\mathbf{d}_3 = 0.74\mathbf{u}_1 - 0.02\mathbf{u}_2, \quad \text{and} \quad \mathbf{d}_4 = 0.84\mathbf{u}_1 - 0.02\mathbf{u}_2.$$


Learning Objectives


Concepts 


• Determine for a system of 3 or fewer unknowns whether it has a unique solution, no solution
or infinitely many solutions.


• Create a set of linear equations from a narrative about how the unknown variables are related
to given data.


• Represent a system of linear equations in matrix-vector notation.
• Solve a linear system of equations.

MATLAB skills
• Compute the determinant of a matrix.


• Solve systems of linear equations of the form Ax = b using all three methods: inverse matrix,
linsolve, and the backslash operator.


9.1 Linear Independence and Bases


Exercise 9.1 


1. Is this set of vectors linearly independent?


2. Please express the vector $\begin{bmatrix} 3 \\ 6 \\ 4 \end{bmatrix}$ as a linear combination of the vectors


3. Show that the following vectors form an orthogonal basis for $\mathbb{R}^4$


Nile 


| NIRNIE 


NIRNIF 


NIBNIF 





NI BDI EN EN|H 


LS) 


4. Suppose that


=a (9.3) 
pat 1 
2 2 


1 
i 
fi 
i 
2 


Please find $c_1$, $c_2$, $c_3$ and $c_4$. As an aside, these vectors form what is called a Walsh code (which
can be expanded to higher dimensions). Walsh codes are used in wireless communications so
signals from multiple users can be added together (e.g. at the antenna of a cell tower), and
then separated by different mobiles.


5. Construct a 4 x 4 matrix $A$ whose columns are the vectors from the previous part. Show that
$c_1, c_2, c_3, c_4$ can be found by solving





9.2 Determinants and Invertibility 


You have already encountered the determinant in class: the determinant of a square matrix is a property of 
the matrix which among other things indicates whether a matrix is invertible or not: if the determinant of a 
square matrix is zero, it is non-invertible. As a reminder:

The determinant of a matrix $G$ is denoted a few different ways.


$$\det(G) = |G| \tag{9.9}$$


For a generic 2 x 2 matrix $G$

$$G = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$

the formula for the determinant is quite straightforward:


$$\det(G) = ad - bc \tag{9.10}$$

For example, for the following 2 x 2 matrix,

$$\det\left(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\right) = \begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = (1)(4) - (2)(3) = -2 \tag{9.11}$$


You already considered the determinant of some transformation matrices. Now let’s consider what the
determinant is really telling us about a general matrix.


Exercise 9.2 


1. Let $A$ be a 2 x 2 matrix

$$A = \begin{bmatrix} x_1 & x_2 \\ y_1 & y_2 \end{bmatrix}.$$

We can think of the columns of $A$ as two vectors beginning at the origin and ending at the
points $(x_1, y_1)$ and $(x_2, y_2)$, respectively. These vectors form a parallelogram, as shown here:

[Figure: parallelogram in the x-y plane spanned by the vectors ending at $(x_1, y_1)$ and $(x_2, y_2)$.]

Show that the magnitude (i.e., absolute value) of $\det(A)$ is equal to the area of a parallelogram
formed by the column vectors of the matrix $A$.


2. What is the determinant of A if its column vectors are on the same line? Graphically, what 
happens to the parallelogram? 





From this, you should get a feeling for the fact that the determinant tells us whether the columns of
A are collinear: it is zero exactly when the two columns are linearly dependent. The determinant
therefore lets us know quickly whether a linear system of algebraic equations has a unique solution, as illustrated in the
following example.


Exercise 9.3 


Consider the following matrix whose columns lie on the same line: the second column is simply
twice the first column.

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$$


1. What is det(A)? 


2. Find all the solutions to Ax = 0. 


3. For which vectors b does Ax = b have a solution? Why are there only certain b vectors that 
lead to solutions to Ax = b? 





While the formula for the determinant of a 2 x 2 matrix is quite straightforward, the procedures for
computing the determinant of larger matrices are more difficult, but they are well known and well documented.
Fortunately, MATLAB has the det function which computes the determinant.
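As a quick illustration, the sketch below compares the 2 x 2 formula with `det` for the matrix in (9.11), and then applies `det` to a 3 x 3 matrix. The 3 x 3 matrix is an assumed example, built so that its third column is the sum of the first two.

```matlab
% The 2 x 2 example from (9.11): formula versus built-in function.
G = [1 2; 3 4];
dFormula = G(1,1)*G(2,2) - G(1,2)*G(2,1);   % ad - bc
dBuiltin = det(G);

% Illustrative 3 x 3 matrix (an assumption for this sketch):
% column 3 = column 1 + column 2, so the columns are dependent.
M = [1 2 3;
     0 1 1;
     2 0 2];
dM = det(M);
```

Because `det` works in floating point, a singular matrix may give a tiny non-zero value rather than exactly zero, so it is safer to compare the result against a small tolerance.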


9.3 Linear Systems of Algebraic Equations: Formulation and Definition


In previous classes, you’ve encountered a bunch of exercises where you had to operate on a vector to find 
another vector: 


Ax =b, (9.13) 


where A and x were known, and your job was to find b. While this is fun and, as you saw above in the 
rectangle exercise, can be useful, there is another related problem which is easily as important. It involves 
the same equation, but now you know A and b and need to find the vector x. As we will discuss here, this 
problem captures the concept of a Linear System of Algebraic Equations. 

One key idea in building models is the step of abstraction: going from some real-world situation to an 
abstracted model for the system (e.g., a set of equations). There are two important aspects of building such 
a model: first, deciding what to include or ignore, and second, deciding how to mathematically represent 
those things you choose to include. 

One particularly common kind of mathematical framing is a set of linear algebraic equations, which
can be represented by a matrix equation. A general system of m linear algebraic equations in n unknown
variables $x_1, x_2, \ldots, x_n$ takes the form

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$

where $a_{11}, a_{12}, \ldots, a_{mn}$ are known as coefficients and $b_1, b_2, b_3, \ldots, b_m$ are constants. We can write this
using matrices and vectors in the form


$$A\mathbf{x} = \mathbf{b}$$


where $A$ is the m x n coefficient matrix, $\mathbf{x}$ is the n x 1 unknown vector, and $\mathbf{b}$ is an m x 1 constant vector
which is known. In other words,


$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.$$

Note that “linear” here means linear in terms of the unknown variables, e.g., if x is an unknown there are
only terms like $ax$, and no terms like $\sin(x)$, $x^2$, $1/x$, etc. It is often the case that you might have coefficients
that appear to be non-linear; for example, in solving physics problems, you might have coefficients that
depend on trig functions of angles, such as $(\cos\theta)F$, which is linear in $F$ but not linear in $\theta$. Be
careful to be clear about what you’re solving for when you decide whether something is linear or non-linear.


9.4 Using Matrix Inverses to Solve Linear Systems 


Over the last couple of weeks, you have worked with rotation matrices, and transformations that were 
compositions of simpler rotations, and learned how to invert them. When you multiply a vector by any 
matrix (not just ones that are associated with simple spatial transformations), you transform the original 
vector x into a new vector b. 


Ax=b 


More generally (than rotations), you can often undo the linear transformation (just like you did with the 
rotation matrix). Undoing this linear transformation is a linear transformation itself! Therefore the act of 
undoing a linear transformation can be formulated with a matrix multiply. 


$$A^{-1}A\mathbf{x} = A^{-1}\mathbf{b}$$
$$\Rightarrow \mathbf{x} = A^{-1}\mathbf{b}$$
This reduces our linear system of algebraic equations problem to the problem of finding the inverse of
our matrix A. Note this is only possible if A is square and invertible.
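In MATLAB this idea takes only a few lines. The sketch below assumes a small invertible system chosen for illustration (it does not come from the exercises) and solves it three ways: with the inverse, with `linsolve`, and with the backslash operator.

```matlab
% Assumed example system:  2*x1 + x2 = 3,  x1 + 3*x2 = 5.
A = [2 1;
     1 3];
b = [3; 5];

% A must be square with a non-zero determinant for the inverse to exist.
d = det(A);

xInv       = inv(A) * b;       % x = A^(-1) * b, as derived above
xLinsolve  = linsolve(A, b);   % dedicated solver function
xBackslash = A \ b;            % backslash operator

% Substituting the answer back in should reproduce b.
residual = norm(A * xBackslash - b);
```

All three approaches agree here. The backslash form is a common choice in practice because it avoids forming the inverse explicitly, although for small well-behaved systems like this one the difference is not noticeable.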
When solving a system of equations, at least half of the battle is typically getting your system abstracted
to the point that it can be thought of as a system of linear equations. The following are a set of problems.
You don’t need to solve these problems; you just need to formulate them as linear algebra problems.


An Investment Example 


In this section we will focus on deciding whether and how you can abstract the system to a mathematical 
model that can be written as a matrix equation. 


Exercise 9.4 


Suppose that the following table describes the stock holdings of three of the QEA instructors. Also
suppose that on a given day the values of the Apple, IBM and General Mills stocks are $100, $50 and
$20 per share, respectively.


           | Apple | IBM | General Mills
Paul       | 100   | 100 | 100
Siddhartan | 100   | 200 | 0
John       | 50    | 50  | 200























1. Here’s your first linear algebra formulation question: What is the total value of the holdings for 
each professor on the day in question? Can you formulate this as a matrix expression? If so, 
what is it? If not, why not? 


2. Now, suppose that you do not know how many shares of each stock are owned by the instructors.
However, you know that the total value of the stocks for each instructor for three consecutive
days is as given in the following table:

      | Paul  | Siddhartan | John
Day 1 | $1500 | $2600      | $950
Day 2 | $1600 | $2810      | $1020
Day 3 | $1400 | $2550      | $1000























You also know that the price of each stock on each of the three days was as follows: 





      | Apple | IBM | General Mills
Day 1 | $100  | $50 | $20
Day 2 | $110  | $50 | $22
Day 3 | $100  | $40 | $30























Now here’s the second formulation question: how many shares of each company does each professor
own? Can you formulate this as a matrix equation? If so, what are the matrices/vectors?
If not, why not?


Let’s now look at an example involving flows. In this case, we are looking at traffic flows on streets, 
and in and out of junctions. Similar ideas are applicable in many situations, including fluid flow in pipes, 
currents flowing in circuits, data packets in computer networks, etc. 





Exercise 9.5 


A portion of the roads in Gotham, which are organized in a grid of one-way streets, is illustrated in
the following figure.

[Figure: grid of one-way streets with the flows $f_1$ through $f_{10}$ marked on the roads.]

Let the average traffic flow on the roads and into/out of the intersections be denoted by $f_1, f_2, \ldots, f_{10}$
in units of cars per minute. Suppose that in the green future, few people drive cars and $f_2 = 1$, $f_5 = 3$,
$f_6 = 3$, $f_7 = 1$, $f_9 = 2$. Assume that the total traffic flow into each intersection equals the total
traffic flow out of the intersection, i.e. there is no buildup of vehicles in any intersection.


1. Set up a system of linear equations in matrix vector form which describes the relationships 
between all the flows. 


2. With the information given, are you able to solve for the unknown flows? If so, please write a
matrix-vector equation to solve for the remaining flows and solve for them using MATLAB.

3. Suppose that we know that $f_3 = 2$. Please write a matrix-vector equation to solve for the
remaining flows, and solve for them using MATLAB.


4. (Bonus question worth zero points): If the arrows above represent "one way" traffic signs, 
which one(s) is/are pointing in the wrong direction(s)? 




9.5 Types of Linear Systems and Types of Solutions 





Consider the linear system of algebraic equations expressed in matrix-vector form as, 
Ax = b. 


If $\mathbf{b} = \mathbf{0}$ the system of linear algebraic equations is homogeneous, and if $\mathbf{b} \neq \mathbf{0}$ the system is non-homogeneous.
As mentioned before, we’ve already dealt with systems like this when we were transforming geometrical
objects, but in that case we already knew $\mathbf{x}$ and we were simply multiplying by $A$ in order to get $\mathbf{b}$.
Here, we are considering the so-called inverse problem, and trying to find $\mathbf{x}$ given $A$ and $\mathbf{b}$. However, let’s
back up and consider some small examples to explore the solution possibilities a little.


Elimination of Variables 


In high school you probably learned some basic techniques for solving small linear systems of algebraic
equations. Consider the following linear system of algebraic equations,


$$2x_1 + 3x_2 = 6 \tag{9.29}$$
$$4x_1 + 9x_2 = 15 \tag{9.30}$$


The basic technique, called elimination of variables, proceeds as follows: First, solve equation (9.29) for $x_1$


$$x_1 = 3 - \tfrac{3}{2}x_2 \tag{9.31}$$


Now substitute this expression for $x_1$ into equation (9.30)

$$4\left(3 - \tfrac{3}{2}x_2\right) + 9x_2 = 15$$

Now we simplify this equation

$$12 - 6x_2 + 9x_2 = 15$$

$$\Rightarrow 3x_2 = 3$$

and solve for $x_2$ to give $x_2 = 1$. Now we substitute this solution back into equation (9.29) or (9.31) to determine
$x_1 = \tfrac{3}{2}$. The original linear system of algebraic equations therefore has a unique solution, $\mathbf{x} = \begin{bmatrix} 3/2 \\ 1 \end{bmatrix}$.


However, not all linear systems of algebraic equations have a unique solution. For example, the system


$$x_1 + 2x_2 = 1 \tag{9.32}$$
$$2x_1 + 4x_2 = 2 \tag{9.33}$$

has an infinite number of solutions because equation (9.33) is just a multiple of equation (9.32). Solving equation
(9.32) for $x_1$ gives

$$x_1 = 1 - 2x_2$$

and choosing an arbitrary value of $x_2 = a$ gives


$$x_1 = 1 - 2a$$

$$x_2 = a$$

or in vector form

$$\mathbf{x} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} + a \begin{bmatrix} -2 \\ 1 \end{bmatrix}.$$

This defines an infinite number of solutions since a is any real number. What do you notice about each part
of this vector?

It’s also possible that a linear system of algebraic equations has no solution. For example, the system


$$x_1 + 2x_2 = 1 \tag{9.34}$$
$$2x_1 + 4x_2 = 1 \tag{9.35}$$


has no solution. Solving equation (9.35) for $x_2$ gives

$$x_2 = \tfrac{1}{4} - \tfrac{1}{2}x_1$$

and replacing into equation (9.34) gives

$$x_1 + 2\left(\tfrac{1}{4} - \tfrac{1}{2}x_1\right) = 1$$

which on simplification gives

$$\tfrac{1}{2} = 1$$

which hopefully we all agree is incorrect. We assumed that there was a solution, performed elimination and
substitution and found a statement that contradicts our assumption: no solution therefore exists.
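The three outcomes above can also be told apart numerically. The sketch below uses a rank comparison, a standard test that this section does not itself introduce: if the rank of $A$ is smaller than the rank of the augmented matrix $[A \; \mathbf{b}]$ there is no solution, and otherwise the solution is unique exactly when the rank equals the number of unknowns. The variable names are illustrative.

```matlab
% The three example systems from the text, (9.29)-(9.35), as {A, b} pairs.
systems = { [2 3; 4 9], [6; 15];
            [1 2; 2 4], [1; 2];
            [1 2; 2 4], [1; 1] };

kinds = cell(size(systems, 1), 1);
for k = 1:size(systems, 1)
    A = systems{k, 1};
    b = systems{k, 2};
    rA  = rank(A);            % rank of the coefficient matrix
    rAb = rank([A b]);        % rank of the augmented matrix
    if rA < rAb
        kinds{k} = 'none';        % equations contradict each other
    elseif rA == size(A, 2)
        kinds{k} = 'unique';      % as many independent equations as unknowns
    else
        kinds{k} = 'infinite';    % consistent, but a free variable remains
    end
end

% The first system has a unique solution, found with backslash.
x = systems{1, 1} \ systems{1, 2};
```

The classification reproduces what elimination of variables found by hand: one solution, infinitely many, and none, in that order.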


Exercise 9.6 


1. Using the technique of elimination of variables described above, determine which values of h 
and k result in the following system of linear algebraic equations having (a) no solution, (b) a 
unique solution, and (c) infinitely many solutions? 


t1+hxe 
221 + 3% 


2. Using the technique of elimination of variables described above, determine whether the
following linear systems of algebraic equations have zero, one, or infinitely many solutions. If 
solution(s) exist, determine the actual solution(s). 


(a)


Solving a linear system of algebraic equations in MATLAB


Exercise 9.7 


In the last class, you worked with an example of fruits in your refrigerator, and we asked you
questions like how to calculate the total weight of the fruits, how many fruits there are, etc. We
can use matrix operations to solve inverse problems as well, as this question illustrates. Suppose
that you know that you have apples and oranges in the fridge and that in the genetically engineered
future, the weights of all apples are 3 oz and all oranges are 4 oz. Because of inflation in this genetically
engineered future, the price of each apple is $1 and the price of each orange is $2. Suppose that you
also know that you paid $13 total for your fruit and the total weight of the fruit is 33 oz. We can use
this information and tools we have developed to figure out how many apples and oranges we have.
Let $n_o$ and $n_a$ be the numbers of oranges and apples in your fridge respectively, and suppose that you don’t
know what these numbers are. Define the following vectors


(9.36) 
(9.37) 


1. Write an equation relating n and d, using a matrix-vector product. 
2. Calculate how many oranges and apples you have. 


3. Why is this kind of problem often called an inverse problem?


Exercise 9.8 


1. Consider the example with the fruits that you worked out earlier. Now, in addition to apples 
and oranges, suppose you also had an unknown number of pears which each weigh 3 oz, and 
cost $3. Additionally, suppose that the total weight of the fruits is 45 oz, and you paid a total 
of $21 for the fruit. 


(a) If possible find the numbers of oranges, apples and pears. If not, please explain why. 


(b) Suppose that you additionally know that you have a total of 14 fruits. Can you formulate
and solve a matrix-vector equation to find out the numbers of oranges, apples and pears
you have?

(c) What is the determinant of the matrix you have set up to solve this?


2. The fruit vendors bought the pricing algorithm from Uber. Oranges are still $2, pears are 
now only $1.50, and (due to an influx of teachers) apples are now surging at $1.50 each. Their 
weights stay the same. You return to the market, and again purchase 14 fruits, which have the 
same total weight and total cost. 


(a) Can you formulate and solve a matrix-vector equation to find out the numbers of oranges, 
apples and pears you have? 


(b) What is the determinant of the matrix you have set up to solve this? 


3. Recall the example with fruits from class: Suppose that you have a total number of 14 apples,
oranges and pears in your fridge. Suppose that each apple costs $1, each orange costs $2 and
each pear costs $3. Assume also that the weight of every apple is 3 oz, every orange is 4 oz
and every pear is 3 oz. Additionally, suppose that the total weight of the fruits is 45 oz, and
you paid a total of $21 for the fruit.


(a) Formulate (or look up your formulation from class) and write down (but don’t solve it
yet) a matrix-vector equation to find out the numbers of oranges, apples and pears you
have.


(b) Solve this equation to find the numbers of apples, oranges and pears using the following
approaches (they will of course give you the same results, but we want you to get familiar
with using the different operations here).

i. Using MATLAB, compute the inverse of the matrix in part (a) and use it to find the
numbers of apples, oranges and pears.

ii. Use MATLAB’s linsolve function to find the numbers of apples, oranges and
pears.


iii. Use MATLAB’s \ operator to find the numbers of apples, oranges and pears.


9.6 Conceptual Quiz 





1. Select the matrices which are invertible. 


a 
of] 
of 
() ; 4 


lt Ae 
(e) Bs rd 
v2 V2 


2. How many solutions does the following system of equations have? 


$$x + y = 9$$


CHAPTER 9. HOMEWORK 3: INDEPENDENCE, SPAN, BASES, AND LINEAR SYSTEMS OF ALGEBRAIC EQUATIONS97 


$$x - z = 2$$
$$y + z = 7$$
A. Zero 
B. One 
C. Two 


D. Infinitely many 
3. What is the area of a parallelogram whose vertices are (0,0), (2, 4), (5, 1) and (7,5)? 


4. Solve the following system of linear equations 


$$x - y = 2$$
$$3x + z = 11$$
$$y - 2z = -3$$


What is the value of y?

Solution 9.1


1. No, because 


1 1 4 0 
Stale te) = |p G5) 
1 0 3 0 
2. 
3 1 0 1 
he ep lt 1a (60) 
4 0 1 1 
3. 
1 
1 1 1 1 i 
[ot eat Za 3 =0 
= 
ri 
2 
fs 3 2 2] |_?| =9 
2 
2 
1 
aT 
oe aff 
4 
1 
a 
[oe ot oe ell ate 
i? 
2 
1 
2 
le a = 5 aff 
a 
1 
2 
ee > 
4 


Note that when you expand out each product of the row and column vectors above, you get
a sum of two $\tfrac{1}{4}$ terms and two $-\tfrac{1}{4}$ terms, resulting in zero. Therefore, the vectors are all
orthogonal to each other.


4. We have already proven that the vectors in this question are mutually orthogonal in the
previous part. Additionally, note that they are all of unit length, and they span $\mathbb{R}^4$, because
they are 4 mutually orthogonal, 4-dimensional vectors. To find $c_1, c_2, c_3, c_4$, we can do the
following:


5 
—2 
a= 4 2) =2 
1 
5 
—2 
a= 4-2 -H|B) = 
1 
5 
—2 
aff -} -$ A/G) = 
1 
5 
fi, ek a | | 
ca = [3 2 2 2 ol (9.7) 
1 


5. Let’s write 


—11 
| 


Niele 
Nie 
Nile 

_—_—_! 

ae | 
ie) 
a 

—__—! 


oo 
NI RN Rb] Hb| 
|| 
Niele 
Nie | 

Ni Ry|FR 
enre 
Nir Nie 
| 
cc 
oo oO 
Co ee) 
[ 

| 

c——— 
em 


0 | (9.8) 


If we expand out the equation above according to the rules of matrix-vector multiplication, we
get (9.3). Therefore, by solving it, we can find $c_1$, $c_2$, $c_3$ and $c_4$.


Solution 9.2 





3. The determinant is equal to 0, i.e., det(A) = 0.


Solution 9.3 
1. det(A)=(1)(4)-(2)(2)=0 


2. There are infinitely many solutions of the form $-x_1 = 2x_2$.


3. Solutions are of the form $\mathbf{b} = \begin{bmatrix} k \\ 2k \end{bmatrix}$ where k is a constant.

Solution 9.4


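1. This can be formulated as $A\mathbf{x} = \mathbf{b}$ where


$$A = \begin{bmatrix} 100 & 100 & 100 \\ 100 & 200 & 0 \\ 50 & 50 & 200 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} 100 \\ 50 \\ 20 \end{bmatrix}, \quad \text{and} \quad \mathbf{b} = \begin{bmatrix} d_{pr} \\ d_{sg} \\ d_{jg} \end{bmatrix}.$$


Doing the matrix multiplication shows that Paul has $d_{pr} = 17000$, Siddhartan has $d_{sg} = 20000$,
and John has $d_{jg} = 11500$.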


2. There are several ways to do this. Perhaps the simplest is to compute each person’s stock
holding individually. To do this, we let $A$ be a matrix with the stock prices


$$A = \begin{bmatrix} 100 & 50 & 20 \\ 110 & 50 & 22 \\ 100 & 40 & 30 \end{bmatrix},$$


let $\mathbf{b}_{pr}$ be a vector representing the value of Paul’s stocks on each day,


$$\mathbf{b}_{pr} = \begin{bmatrix} 1500 \\ 1600 \\ 1400 \end{bmatrix},$$


and let $\mathbf{x}_{pr}$ be a vector representing Paul’s stock holdings (i.e., the first entry tells us how
many shares of Apple he has, the second entry is IBM, and the third is General Mills). This
gives the equation $A\mathbf{x}_{pr} = \mathbf{b}_{pr}$. By inverting $A$ we can solve for $\mathbf{x}_{pr}$. Then we repeat this
procedure for each of the other instructors.


But... we can do it quicker! Form a 3 x 3 matrix $X$ whose columns are made of the vectors
$\mathbf{x}_{pr}$, $\mathbf{x}_{sg}$, and $\mathbf{x}_{jg}$. Then form a 3 x 3 matrix $B$ whose columns are made of the vectors $\mathbf{b}_{pr}$, $\mathbf{b}_{sg}$,
and $\mathbf{b}_{jg}$. This gives the equation $AX = B$. Inverting $A$, we can solve for $X$:







              | Paul | Siddhartan | John
Apple         | 10   | 20         | 5
IBM           | 10   | 10         | 5
General Mills | 0    | 5          | 10
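The quicker approach described above can be sketched in MATLAB as follows. The matrices come from the price and value tables in Exercise 9.4; the variable names are illustrative, and the backslash operator is used here in place of an explicit inverse, which gives the same result as inv(A)*B for this invertible $A$.

```matlab
% Stock prices: rows are days 1-3, columns are Apple, IBM, General Mills.
A = [100 50 20;
     110 50 22;
     100 40 30];

% Total holdings value: rows are days 1-3, columns are Paul, Siddhartan, John.
B = [1500 2600  950;
     1600 2810 1020;
     1400 2550 1000];

% Solve A*X = B for all three instructors at once.
% Column j of X holds the share counts for instructor j.
X = A \ B;
```

Each column of `X` matches the corresponding column of the table above, so one solve replaces three separate ones.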




















Solution 9.5 


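1. Let’s sum all the flows into each junction. We therefore have

$$f_1 + f_2 = f_3 + f_{10} \tag{9.14}$$
$$f_3 = f_4 + f_5 \tag{9.15}$$
$$f_7 + f_8 = f_4 + f_6 \tag{9.16}$$
$$f_9 = f_{10} + f_8 \tag{9.17}$$

Substituting the known information and rearranging gives us

$$f_1 - f_3 - f_{10} = -1 \tag{9.18}$$
$$f_3 - f_4 = 3 \tag{9.19}$$
$$f_8 - f_4 = 2 \tag{9.20}$$
$$f_8 + f_{10} = 2 \tag{9.21}$$

So we can then write a matrix vector equation

$$\begin{bmatrix} 1 & -1 & 0 & 0 & -1 \\ 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} f_1 \\ f_3 \\ f_4 \\ f_8 \\ f_{10} \end{bmatrix} = \begin{bmatrix} -1 \\ 3 \\ 2 \\ 2 \end{bmatrix} \tag{9.22}$$

2. You cannot solve for the variables since a 4 x 5 matrix is not invertible.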


3. In this case, the equations become


$$f_1 - f_{10} = 1 \tag{9.23}$$
$$f_4 = -1 \tag{9.24}$$
$$f_8 - f_4 = 2 \tag{9.25}$$
$$f_8 + f_{10} = 2 \tag{9.26}$$

So we can then write a matrix vector equation

$$\begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} f_1 \\ f_4 \\ f_8 \\ f_{10} \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 2 \\ 2 \end{bmatrix} \tag{9.27}$$


We can check the determinant of the matrix in the equation above in MATLAB, to find that it
equals 1. Therefore, we can solve for $f_1, f_4, f_8, f_{10}$ by inverting the matrix as follows


$$\begin{bmatrix} f_1 \\ f_4 \\ f_8 \\ f_{10} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ -1 \\ 2 \\ 2 \end{bmatrix} \tag{9.28}$$


The MATLAB code to solve this problem is:


» A = [1 0 0 -1; 0 1 0 0; 0 -1 1 0; 0 0 1 1] % set up the matrix
» det(A) % check determinant
» b = [1; -1; 2; 2] % set up vector of known values
» inv(A)*b % solve for the unknowns


4. The arrow for $f_4$ is pointing the wrong way, as the solution above has $f_4$ as a negative number.


Solution 9.6 


1. Rearrange the equations to linear form $y = mx + b$. If the lines are identical, there are infinitely
many solutions; if the lines are parallel, but don’t overlap, there are zero solutions; if the lines
are not parallel, there is one solution.

(a) $h = 3/2$, $k \neq 2$

(b) $h \neq 3/2$

(c) $h = 3/2$, $k = 2$

2. (a) $c = \begin{bmatrix} 4 \\ 2 \\ 0 \end{bmatrix}$


(b) No Solution 


(c) Infinite Solutions 


Solution 9.7 


= [eal d= [es 


CHAPTER 9. HOMEWORK 3: INDEPENDENCE, SPAN, BASES, AND LINEAR SYSTEMS OF ALGEBRAIC EQUATIONS 102 


+ bel = El 


3. In this case we know the result b, and are working backwards to find the number of apples 
and oranges. We also use a matrix inverse to find the result. 


Solution 9.8 
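1. (a) No, you have three unknowns and only two equations.

(b) Yes, you now have three equations and three unknowns.

\[
\begin{bmatrix} 2 & 1 & 3 \\ 4 & 3 & 3 \\ 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} n_o \\ n_a \\ n_p \end{bmatrix}
=
\begin{bmatrix} 21 \\ 45 \\ 14 \end{bmatrix},
\qquad
\begin{bmatrix} n_o \\ n_a \\ n_p \end{bmatrix}
=
\begin{bmatrix} 3 \\ 9 \\ 2 \end{bmatrix}
\]

(c) det(A) = 2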
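2. (a)

\[
\begin{bmatrix} 2 & 1.50 & 1.50 \\ 4 & 3 & 3 \\ 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} n_o \\ n_a \\ n_p \end{bmatrix}
=
\begin{bmatrix} 21 \\ 45 \\ 14 \end{bmatrix}
\]

This formulation cannot be solved because A is not invertible.

(b) det(A) = 0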
3. (a) 


(b) i. 


iii. 
10.1 Debrief [15 mins]


• Please discuss your homework with your breakout room, and resolve any issues with your peers
and/or an instructor.


10.2 Synthesis [55 mins] 


We will increasingly use a computational tool like MATLAB to compute determinants, matrix inverses, and 
the solutions to linear systems of algebraic equations. In this synthesis section we will explore the theoretical 
foundation of these algorithms - the so-called LU decomposition. 


Gaussian Elimination 


The basic process of elimination of variables can be formalized and is known as Gaussian Elimination. Here
we will briefly introduce it, but you can consult other sources on the internet.

Rather than writing equations, we can cast a linear system of algebraic equations in matrix form and 
perform Gaussian Elimination on the augmented matrix [A b]. 

For example, the linear system of algebraic equations

\[
\begin{aligned}
2x_1 + 3x_2 &= 6 \\
4x_1 + 9x_2 &= 15
\end{aligned}
\]

can be written as the following augmented matrix

\[
\left[\begin{array}{cc|c} 2 & 3 & 6 \\ 4 & 9 & 15 \end{array}\right]
\]

Thinking now in terms of rows, we replace the second row with row 2 - 2 x row 1 to give

\[
\left[\begin{array}{cc|c} 2 & 3 & 6 \\ 0 & 3 & 3 \end{array}\right]
\]

This matrix is now in so-called echelon form: we can find the solution to the original linear system of
algebraic equations by first solving the equation implied by the last row and then back-substituting into the
equation implied by the previous row. The equation corresponding to the second row is

\[ 3x_2 = 3 \]

which has solution $x_2 = 1$. Substituting into the equation corresponding to the first row we find

\[ 2x_1 + 3 = 6 \]

which has solution $x_1 = 3/2$.
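The elimination and back-substitution steps for this 2 x 2 example can be sketched in MATLAB as follows. This is only an illustration for this particular system, with variable names of our own choosing; it is not a general-purpose routine (there is no pivoting and no check for zero diagonal entries).

```matlab
% Augmented matrix [A b] for 2*x1 + 3*x2 = 6 and 4*x1 + 9*x2 = 15
Ab = [2 3 6; 4 9 15];

% Row operation: row 2 <- row 2 - 2 * row 1 (gives echelon form)
Ab(2,:) = Ab(2,:) - 2*Ab(1,:);

% Back-substitution: solve the last row first, then the first row
x2 = Ab(2,3) / Ab(2,2);
x1 = (Ab(1,3) - Ab(1,2)*x2) / Ab(1,1);
x = [x1; x2]
```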


Exercise 10.1 


1. Set up the augmented matrix for the following example (you will recognise this from the last 
assignment) 


Dis Ate = 13} 
4x1 + 3x2 33 


and perform Gaussian Elimination to reduce the augmented matrix to echelon form. Interpret 
the resulting system and determine the solution(s). 





LU Decomposition 


The steps used to solve a linear system of algebraic equations using Gaussian Elimination can also be used 
to decompose a matrix into a product of two matrices: a lower-triangular matrix L and an upper-triangular 
matrix U. Here we will briefly introduce it but you could consult other sources on the internet. 

In Gaussian Elimination we execute a set of row operations. In our ongoing example, we replaced row 2
with the result of row 2 - 2 x row 1. This action can be neatly represented in terms of a matrix operation.
Let’s multiply the original matrix equation $Ax = b$ with the transformation matrix

\[ M = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix} \]

to form $MAx = Mb$. Note that this transformation leaves row 1 of $A$ unchanged, and it replaces row 2
with row 2 - 2 x row 1. The product $MA$ is an upper-triangular matrix $U$

\[ U = \begin{bmatrix} 2 & 3 \\ 0 & 3 \end{bmatrix} \]

and the linear system of algebraic equations is now expressed as $Ux = Mb$. If we now multiply this
expression by $M^{-1}$ we obtain

\[ M^{-1}Ux = b \]

The inverse of $M$ is straight-forward to write down because it "undoes" the row operations

\[ M^{-1} = \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix} \]

Notice that this matrix is just a lower-triangular matrix $L$. The linear system of algebraic equations now
reads

\[ LUx = b \]

We have therefore decomposed the original matrix $A$ into the product of $L$ and $U$,

\[ A = LU \]


How does this help, you might be asking? First of all, knowing the decomposition of A into LU allows us 
to solve the original linear system of algebraic equations Ax = b. Here is how. 

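Let’s define a new vector $y = Ux$. Then the original linear system of algebraic equations can be
expressed as

\[ Ly = b \]

which is straight-forward to solve by forward-substitution because $L$ is lower-triangular,

\[
\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix} 6 \\ 15 \end{bmatrix}
\]

and the solution for $y$ is $y_1 = 6$, $y_2 = 3$. We can now solve $Ux = y$ for $x$ using backward-substitution
because $U$ is upper-triangular,

\[
\begin{bmatrix} 2 & 3 \\ 0 & 3 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 6 \\ 3 \end{bmatrix}
\]

and the solution for $x$ is $x_1 = 3/2$, $x_2 = 1$.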
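The two substitution passes can be written out explicitly in MATLAB. The sketch below hard-codes the 2 x 2 factors from the running example; the variable names are ours, and the code is meant to mirror the hand calculation rather than to serve as a general solver.

```matlab
% Factors of A = [2 3; 4 9] from the running example
L = [1 0; 2 1];
U = [2 3; 0 3];
b = [6; 15];

% Forward substitution: solve L*y = b from the top row down
y = zeros(2,1);
y(1) = b(1) / L(1,1);
y(2) = (b(2) - L(2,1)*y(1)) / L(2,2);

% Backward substitution: solve U*x = y from the bottom row up
x = zeros(2,1);
x(2) = y(2) / U(2,2);
x(1) = (y(1) - U(1,2)*x(2)) / U(1,1);

x   % should reproduce the solution found by Gaussian Elimination
```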

Second of all, and more importantly, knowing the decomposition of A into LU allows us to solve any 
linear system of algebraic equations involving A. Need to solve the linear system of algebraic equations 
with a different b? No problem, just use the LU decomposition that you already computed and away you 
go. No need to redo all the steps of Gaussian Elimination just because b changed. Need to solve a linear 
system of algebraic equations for lots of different b’s? No problem, just use the LU decomposition that you 
already computed and away you go. Finally, if you want to compute the inverse or determinant of a matrix 
this is easy too using LU decomposition as we show next. 

There is an algorithm in MATLAB, lu, which does LU decomposition for you, but you should not 
necessarily expect to get the same L and U, even for this example. (There are a variety of ways to define the 
L and U matrices, but this is beyond the scope of this section.) 
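One reason the built-in factors can differ from the hand-computed ones is row pivoting. As a small illustration (using the 2 x 2 matrix from the running example), the three-output form of lu also returns a permutation matrix P, and the factors satisfy P*A = L*U rather than A = L*U:

```matlab
A = [2 3; 4 9];

% Three-output form: P records any row swaps made during elimination
[L, U, P] = lu(A);

% The factorization should satisfy P*A = L*U up to rounding
check = norm(P*A - L*U)
```

Comparing L and U here with the matrices derived by hand above is a quick way to see that an LU decomposition is not unique once row swaps are allowed.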


Exercise 10.2 


1. Consider the appropriate matrix from the last exercise and perform LU Decomposition. Check
your answer by confirming that A = LU. (Please note that you perform LU decomposition
on the original matrix A, not the augmented matrix.)





Determinant 


The basic algorithm for computing a determinant of A is to first perform LU decomposition, and make use 
of the following property: 


The determinant of an upper-triangular or lower-triangular matrix is just the product of the 
diagonal entries. 


We already met another property of determinants, namely that the determinant of a product is just the 
product of the determinants. Therefore, det(A) = det(L)det(U), each of which is just the product of the 
diagonal entries. 
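For the running example, this property can be checked with a few lines of MATLAB. The factors below are the hand-computed L and U from the LU Decomposition section; the variable names are our own.

```matlab
A = [2 3; 4 9];
L = [1 0; 2 1];
U = [2 3; 0 3];

% Determinant of a triangular matrix = product of its diagonal entries
detFromLU = prod(diag(L)) * prod(diag(U))

% Compare with the built-in determinant
detBuiltIn = det(A)
```

Both values should equal 6, up to rounding in the built-in computation.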


Exercise 10.3 


1. Consider the appropriate matrix from the last exercise and find the determinant using the LU
decomposition previously determined. Check your answer using det in MATLAB.





Inverse 


The basic algorithm for computing the inverse of A is to first perform LU decomposition, and make use of 
the following idea. B is the inverse of A if it satisfies the following property 


AB=I 


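The columns of $B$ are just the solutions of a linear system of algebraic equations with a different $b$. For
example, in the two by two case we can solve

\[ Ax = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]

and then

\[ Ax = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

and if we fill the columns of $B$ with the solutions to these linear systems of algebraic equations we will have
constructed the inverse. Since we already have the LU decomposition of $A$ we simply solve each case using
the technique already presented.

For example, the first column of $B$ is determined as follows: First we solve $Ly = b$

\[
\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \end{bmatrix}
\]

to give $y_1 = 1$ and $y_2 = -2$. Now we solve $Ux = y$

\[
\begin{bmatrix} 2 & 3 \\ 0 & 3 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 1 \\ -2 \end{bmatrix}
\]

and the solution for $x$ is $x_1 = 3/2$, $x_2 = -2/3$. These are the entries in the first column of the inverse.

Repeating this process for $b = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ will give the second column of the inverse, which now reads

\[
A^{-1} = \begin{bmatrix} 3/2 & -1/2 \\ -2/3 & 1/3 \end{bmatrix}
\]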
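The column-by-column construction can be sketched in MATLAB as below. The factors are the hand-computed L and U for the running example, and the loop variable and matrix names are our own. The backslash operator is used for each triangular solve, which stands in for the forward and backward substitution done by hand.

```matlab
A = [2 3; 4 9];
L = [1 0; 2 1];
U = [2 3; 0 3];

I2 = eye(2);
B = zeros(2);
for k = 1:2
    y = L \ I2(:,k);    % forward substitution with the k-th column of I
    B(:,k) = U \ y;     % backward substitution gives the k-th column of B
end

B                        % candidate inverse
check = norm(A*B - I2)   % should be close to zero
```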


Exercise 10.4 


1. Consider the appropriate matrix from the previous exercise and find the inverse using the LU 
decomposition previously determined. Check your answer using inv in MATLAB. 





10.3 Applications of Linear Systems of Algebraic Equations [20 mins]


Chemical Analysis

Exercise 10.5


The complete combustion of propane, C3H8, with oxygen, O2, yields carbon dioxide, CO2, and water,
H2O. Based on conservation of mass, this reaction can be written as


a(C3H8) + b(O2) = c(CO2) + d(H2O)


Determine the coefficients in the combustion equation. Note that you will need to learn how to
"balance" a chemical reaction.

11.1 Applications of LSAE: GPS Positioning [40 mins]

Exercise 11.1


Consider a simplified model of a Global-Positioning System (GPS) where we use Cartesian coordinates 
to represent points on Earth with the origin being the center of the earth. The units of measurement 
are in earth radii, e.g. (0, 0, 1) is the North Pole. Suppose that signals from three satellites are received 
at a point on earth (e.g. the dot off the coast of West Africa in the picture, but the answer is not 
necessarily this) . Let the coordinates of this point be (x, y, z). Suppose that all 3 satellites transmit 
their signals at time zero, but the signals arrive at the receiver at different times due to the different 
distances between the satellites and the receiver. These signals propagate at the speed of light. Let 
the coordinates of each satellite and the time its signal was received be given in the table below. 


Coordinates of Satellite | Signal arrival time (ms)
(1, 2, 0)                | 28.3
(0, 1, 2)                | 40.7
(1, 0, 2)                | 41.1











1. Please find the coordinates of the point (x, y, z) by formulating and solving a linear system 
of algebraic equations. You should make any reasonable assumptions, e.g. the point is on 
the surface of the earth. We suggest expressing the speed of light in terms of earth radii per 
millisecond, and using MATLAB to numerically compute the final solution. 


2. Please describe what additional information (if any) you would need to solve this problem if 
the signals from the satellites were sent simultaneously, at some unknown time t instead of 
time zero? 


Note that this problem is simplified and ignores a number of practical considerations such as having a 
common reference time, noise, weak signals, and time dilation. However, the basic idea of positioning 
by using different arrival times of signals from different satellites is what is at the heart of GPS 
systems. This problem is inspired by a number of similar problems found on the web, including from 
Yonsei University and University of Rhode Island.

11.2 Concept Map for Eigenfaces [30 mins]


Let’s now switch gears. For the facial recognition project we will be primarily focusing on an early facial 
recognition software algorithm, Eigenfaces, which is still used for face detection, and introduces some other 
concepts that are extremely important in both facial recognition and other tasks. 

We would like you to spend some time developing an understanding of what you know, and what you 
don’t know about facial recognition using Eigenfaces. A good way to do this is to break down the concept 
until you get to the point that you have terms that you do know: 


1. Write the key term at the top or in the center. Circle it, since you don’t know it. 


2. Research it, and identify terms that are immediately associated with it. Write them down and connect 
them. 


3. Circle new terms you don’t understand, and break these down too. 







[Figure: concept map with "Special Relativity" as the key term, connected to "Lorentz transformation", "time dilation", "length contraction", "reference frames", and "mass-energy equivalence"; further nodes include "mass", "energy", and "speed of light".]

Figure 11.1: If you were trying to break down special relativity, a portion of your breakdown might look like
this...


Once you’ve done your breakdown, try to make the following lists individually: 
1. Relevant fundamental mathematical terms that I don’t know 

2. Relevant fundamental mathematical terms that I do know 

3. Ideas specific to facial or image recognition that I don’t know 


4. Ideas specific to facial or image recognition that I do know

11.3 Appendix: Worked Example for LU Decomposition


In case you are curious about LU decomposition, we have provided the following (optional) worked example 
of finding an LU decomposition of a 3 x 3 matrix. We would like to find the LU decomposition for the 
following matrix: 


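\[
\begin{bmatrix} 2 & 3 & 1 \\ 4 & 7 & 3 \\ 6 & 13 & 10 \end{bmatrix}
\tag{11.2}
\]

We can start by writing

\[
\begin{bmatrix} 2 & 3 & 1 \\ 4 & 7 & 3 \\ 6 & 13 & 10 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ L_{21} & 1 & 0 \\ L_{31} & L_{32} & 1 \end{bmatrix}
\begin{bmatrix} U_{11} & U_{12} & U_{13} \\ 0 & U_{22} & U_{23} \\ 0 & 0 & U_{33} \end{bmatrix}
\tag{11.3}
\]

Multiplying everything out yields the following:

\[
\begin{bmatrix}
U_{11} & U_{12} & U_{13} \\
L_{21}U_{11} & L_{21}U_{12} + U_{22} & L_{21}U_{13} + U_{23} \\
L_{31}U_{11} & L_{31}U_{12} + L_{32}U_{22} & L_{31}U_{13} + L_{32}U_{23} + U_{33}
\end{bmatrix}
=
\begin{bmatrix} 2 & 3 & 1 \\ 4 & 7 & 3 \\ 6 & 13 & 10 \end{bmatrix}
\]

We can just read off the first row, and then iterate through the remaining terms

\[
\begin{aligned}
&U_{11} = 2 \\
&U_{12} = 3 \\
&U_{13} = 1 \\
&L_{21}U_{11} = 4 \implies L_{21} = 2 \\
&L_{21}U_{12} + U_{22} = 7 \implies U_{22} = 1 \\
&L_{21}U_{13} + U_{23} = 3 \implies U_{23} = 1 \\
&L_{31}U_{11} = 6 \implies L_{31} = 3 \\
&L_{31}U_{12} + L_{32}U_{22} = 13 \implies L_{32} = 4 \\
&L_{31}U_{13} + L_{32}U_{23} + U_{33} = 10 \implies U_{33} = 3
\end{aligned}
\]

So the complete decomposition is

\[
\begin{bmatrix} 2 & 3 & 1 \\ 4 & 7 & 3 \\ 6 & 13 & 10 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 4 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 3 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 3 \end{bmatrix}
\]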


One application for LU decomposition is in computing the determinant of a matrix. The determinant of the
product of two square matrices is the product of their determinants. The determinant of an upper- or
lower-triangular matrix is just the product of its diagonal entries. So the determinant of this 3 x 3 matrix is
(1 x 1 x 1) x (2 x 1 x 3) = 6, which can be confirmed in MATLAB.
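The term-by-term procedure above can also be written as a short loop. The following MATLAB sketch assumes, as in the worked example, that L has ones on its diagonal and that no row swaps are needed (every U(i,i) is nonzero); the variable names are our own, and this is not how MATLAB's built-in lu is implemented.

```matlab
A = [2 3 1; 4 7 3; 6 13 10];
n = size(A, 1);
L = eye(n);      % unit diagonal, as assumed in equation (11.3)
U = zeros(n);

for i = 1:n
    % Row i of U: subtract the contributions of the rows already found
    for j = i:n
        U(i,j) = A(i,j) - L(i,1:i-1) * U(1:i-1,j);
    end
    % Column i of L below the diagonal
    for k = i+1:n
        L(k,i) = (A(k,i) - L(k,1:i-1) * U(1:i-1,i)) / U(i,i);
    end
end

L, U
detA = prod(diag(L)) * prod(diag(U))   % product of the diagonal entries
```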


Solution 10.5
C3H8 + 5O2 → 3CO2 + 4H2O
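One way to arrive at these coefficients is to write one conservation equation per element and solve the resulting linear system. The sketch below fixes a = 1 to remove the free overall scale, which is an assumption of ours (any nonzero choice works, followed by rescaling to whole numbers); the matrix and variable names are also our own.

```matlab
% Unknowns are [b; c; d], with a fixed to 1
% Carbon:   3a = c       ->        c      = 3
% Hydrogen: 8a = 2d      ->            2d = 8
% Oxygen:   2b = 2c + d  ->  2b - 2c - d  = 0
M   = [0 1 0; 0 0 2; 2 -2 -1];
rhs = [3; 8; 0];

coeffs = M \ rhs    % entries are b, c, d
```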


Solution 11.1 


1. Let the distances from the satellites to $(x, y, z)$ be denoted by $d_1$, $d_2$, $d_3$. Since the signals
propagate at the speed of light $c$,

\[
\begin{aligned}
d_1 &= 28.3c \\
d_2 &= 40.7c \\
d_3 &= 41.1c
\end{aligned}
\]

Additionally, we can find the distances from the satellites to the point $(x, y, z)$ directly from
the coordinates

\[
\begin{aligned}
d_1 &= \sqrt{(x-1)^2 + (y-2)^2 + z^2} \\
d_2 &= \sqrt{x^2 + (y-1)^2 + (z-2)^2} \\
d_3 &= \sqrt{(x-1)^2 + y^2 + (z-2)^2}
\end{aligned}
\]

Squaring and expanding the 6 previous equations yields

\[
\begin{aligned}
d_1^2 &= (28.3c)^2 \\
d_2^2 &= (40.7c)^2 \\
d_3^2 &= (41.1c)^2 \\
d_1^2 &= x^2 - 2x + 1 + y^2 - 4y + 4 + z^2 = x^2 - 2x + y^2 - 4y + z^2 + 5 \\
d_2^2 &= x^2 + y^2 - 2y + 1 + z^2 - 4z + 4 = x^2 + y^2 - 2y + z^2 - 4z + 5 \\
d_3^2 &= x^2 - 2x + 1 + y^2 + z^2 - 4z + 4 = x^2 - 2x + y^2 + z^2 - 4z + 5
\end{aligned}
\]

Equating terms

\[
\begin{aligned}
x^2 - 2x + y^2 - 4y + z^2 + 5 &= (28.3c)^2 \\
x^2 + y^2 - 2y + z^2 - 4z + 5 &= (40.7c)^2 \\
x^2 - 2x + y^2 + z^2 - 4z + 5 &= (41.1c)^2
\end{aligned}
\]

Note that since the points are on earth’s surface, $x^2 + y^2 + z^2 = 1$. Thus, we have

\[
\begin{aligned}
-2x - 4y &= (28.3c)^2 - 6 \\
-2y - 4z &= (40.7c)^2 - 6 \\
-2x - 4z &= (41.1c)^2 - 6
\end{aligned}
\]

Writing this in a matrix-vector equation

\[
\begin{bmatrix} -2 & -4 & 0 \\ 0 & -2 & -4 \\ -2 & 0 & -4 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} (28.3c)^2 - 6 \\ (40.7c)^2 - 6 \\ (41.1c)^2 - 6 \end{bmatrix}
\tag{11.1}
\]


The MATLAB code to solve this is

c = 0.047;
A = [-2 -4 0; 0 -2 -4; -2 0 -4]
b = [(28.3*c)^2-6; (40.7*c)^2-6; (41.1*c)^2-6];
inv(A)*b

ans =

    0.68
    0.72
    0.23


2. Information from an additional satellite will allow you to solve this problem without assuming
that signals were transmitted at time zero. You can assume that signals were transmitted at
an unknown time $t$, and the $d_1^2$ term will become $d_1^2 = c^2(28.3 - t)^2$, with $d_2$, $d_3$, $d_4$ taking
similar forms. You can then construct 4 equations involving the $x$, $y$ and $z$ variables, eliminate
squared terms and solve for $x$, $y$ and $z$.

Learning Objectives
Concepts

• Describe how a vector can be used to represent a data set.

• Explain how a matrix is used to represent multiple data sets.

• Explain what is meant by vectorizing a grayscale image.

• Predict the size of a vectorized image, given its pixel dimensions and color (gray or color).

MATLAB skills

• Convert a color image into a grayscale image.

• Convert an image to a matrix and back again.


12.1 Ethics, Artificial Intelligence, and Facial Recognition 


Face recognition is a technology with many possible applications. In just the past dozen or so years, the 
technology has gone from the stuff of science fiction to something that we interact with every day (e.g.,
auto-tagging of images uploaded to social media). In this part of the assignment we are going to ask you to
take a deep dive into how this technology manifests itself in the real world, often with mixed consequences
for society.

This section is structured into three parts. First, we'll have you read about some of the issues that have
been raised around face recognition technology (and more generally face analysis technology). Next, you'll 
read some frameworks that have been proposed to help mitigate the potential harm and maximize the 
benefits that might otherwise come from releasing poorly tested and biased AI systems. Finally, we'll have 
you branch out from face recognition technology to AI in general to examine which applications of the 
technology you think have the potential to most positively impact the world. You will discuss and synthesize 
your findings in class on Thursday, so make sure to take some sort of notes on what you read (there are also 
some specific prompts to respond to below). 


Face Recognition Technology 


Exercise 12.1 


1. For a good overview of the issues, we'd like you to read Joy Buolamwini’s written testimony
that she then presented orally. You can pick whether you read the testimony or watch the
video, although one nice thing about the written testimony is that it cites a lot of sources that
you can read for more information.


Based on this reading, generate a list of surprising insights (e.g., spurred by key quotes) that you
gained. Also generate at least one discussion question.


2. For a discussion of gender recognition technology (based on images of people’s faces), please
read Gender Recognition or Gender Reductionism?: The Social Implications of Embedded
Gender Recognition Systems.


Based on this reading, generate a list of surprising insights (e.g., spurred by key quotes) that you 
gained. Also generate at least one discussion question. 





Frameworks and Guidelines for Responsible Machine Learning 


Exercise 12.2 


Face recognition technology falls under the umbrella of machine learning. Machine learning is a field 
concerned with creating technologies that enable computers to learn to perform tasks automatically 
from experience (e.g., recognizing someone’s identity from a picture of their face), often by ingesting
large training sets of labeled data. Sparked by a recognition that machine learning technologies 
were causing unanticipated harm in the real world, a lot of attention has been paid in recent years 
(both in industry and academia) to issues of fairness, accountability, and transparency. Here are two 
frameworks that have been created. 


• Principles for Accountable Algorithms


• Google’s Inclusive ML


Based on this reading, generate a list of surprising insights (e.g., spurred by key quotes) that you gained. 
Also generate at least one discussion question. 

To get a sense of all of the conversations taking place around this topic, check out ACM’s FAccT
network of events.

Beyond Face Recognition


One thing that is important to mention at this point in the module is that while we are learning linear algebra 
and data analysis techniques within the context of face recognition, what you are learning can be applied to 
innumerable applications and fields of study. Even if we just stay within the realm of artificial intelligence, 
what you are learning now (and will learn later in the course) is the bedrock of many AI algorithms that are 
used in all sorts of applications. When learning about all of the issues that a technology like face recognition 
has, we find that students can sometimes have a tendency to move towards a nihilistic perspective on 
technology as a whole (e.g., all technology is bad / harmful). Critiquing technology and its role and effect in 
society is absolutely vital for any engineer. However, we contend that trying to understand how technology 
can be developed in a way that minimizes harm while maximizing benefit (e.g., the frameworks from the 
previous section) or by applying technology to problems or domains that have great potential for positive 
impact is also crucial. In this section, we are asking you to look into applications of image analysis (or 
artificial intelligence more generally) that have the potential for great positive impact on society. 


Exercise 12.3 


Find an article or paper about an application of artificial intelligence (it could be specifically about 
image or face analysis, but it need not be) that you think has the potential for great positive impact on 
society. Come to class ready to summarize the application and why you think it has the potential for 
positive impact. Unpack the notion of positive impact by specifying what the benefits (or downsides) 
would be of the application and who would reap them. 

If you need some inspiration, here are some starting points (we are not claiming these are necessarily 
unambiguously positive, but they may provide some good starting points for your search). 


• Automated diagnosis of cancer from medical images

• Automated, personalized education

• Optimizing energy use with artificial intelligence (more generally “Computational Sustainability”)

• Sensing for driverless cars (e.g., pedestrian detection, road sign reading)

• Recognition and reading of text in a camera feed for people who are blind

• Automated wildlife monitoring via image analysis

• This one is kind of cheating. Olin 2nd year Austin Vesiliza put together a list of links to AI for
social good projects that you might use for inspiration.

12.2 Manipulating Images with Matrices


[Figure: a matrix of RGB values drawn as three stacked layers (Red, Green, Blue), with the image height in pixels marked along one side.]

Figure 12.1: Anatomy of an RGB image array.


Exercise 12.4 
Our next example is of an image pre-processing step that many of you would eventually do using
built-in MATLAB functions before running your face detection algorithm.


1. Read an image file using MATLAB, and convert it to double precision numbers (the data format
that MATLAB uses by default for vectors and matrices) using the following code:


X = imread('giraffe.jpg');


(If you get an error, try re-typing the apostrophes.) 


2. Color images are stored in a 3-dimensional array (as opposed to matrices, which are 2-dimensional
arrays) in MATLAB. Compare this to the smiley face image you saw in class
which was a matrix whose entries are the gray-scale values. Here, instead of gray-scale values,
the color information is stored in Red, Green and Blue entries of the three-dimensional array.
Therefore, each pixel in the image is associated with three different values which indicate how
much of Red, Green and Blue are present in that pixel. This array is illustrated in Figure 12.1.


You can see the dimensions of this array using the following.


size(X)


3. Display the image using


imagesc(X);


The image may be squashed; if you would like it not to be squashed, type axis equal into
the command window.

4. What will the dimensions of the matrix with the grayscale representation of this image be?


5. We will now use matrix manipulations to turn the image into a grayscale image. The RGB
array can be separated into three slices, one for each color. For example, the red slice is all the
data in the first layer of the array:

X_red = X(:,:,1);

Converting a pixel to grayscale can be accomplished by taking a linear combination of the red,
green and blue values of that pixel which are weighted by 0.2989, 0.5870 and 0.1140 respectively.
Use these weights to create a linear combination of the red, green, and blue slices.


6. Verify if this was done correctly by displaying the image using the following commands.


imagesc(grayscaleX); colormap('gray'); axis equal
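To illustrate the weighted combination on something small enough to check by hand, here is a sketch that uses a made-up 2 x 2 RGB array instead of a real photo. The array contents are hypothetical; the weights are the ones quoted in the exercise.

```matlab
% Hypothetical 2 x 2 RGB image stored as doubles (stand-in for a real photo)
X = zeros(2, 2, 3);
X(:,:,1) = [255 0; 0 255];   % red layer
X(:,:,2) = [0 255; 0 255];   % green layer
X(:,:,3) = [0 0; 255 255];   % blue layer

% Weighted sum of the three color slices gives one gray value per pixel
grayscaleX = 0.2989*X(:,:,1) + 0.5870*X(:,:,2) + 0.1140*X(:,:,3);

size(grayscaleX)             % one layer only: 2 x 2
```

Note that an array returned by imread is typically of an integer type such as uint8, so converting it with double before taking the weighted sum avoids integer rounding and saturation.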


12.3 Further Examples on Decomposition 


Exercise 12.5 


1. In this problem, we are going to express the temperature data for four cities we encountered
earlier using a given set of basis vectors. Load some sample temperature data in MATLAB
by typing » load temperatures_and_bases.mat. Type whos at the MATLAB
prompt to see all your variables. You should have a matrix T which has the temperature data
for 1 year for the cities of Boston, New York, Washington DC and Providence in that order.
Use the size command to determine how the data are organized in this matrix. You should
also have four vectors $u_1, \dots, u_4$ which a genie has provided to you.


(a) Verify that the vectors $u_1, \dots, u_4$ are all mutually orthogonal, and that they have unit
length.


(b) Set up and solve the linear algebra problem in order to express each column of the
temperature matrix T as a linear combination of $u_1, \dots, u_4$. Check that you can undo this
operation and retrieve the original data.


(c) Now let’s reconstruct an approximation to the original temperature data, using only the
vectors $u_1$, $u_2$, $u_3$. What is the rms error for this approximation?


(d) Compare the rms error for the previous scheme to a simpler scheme where in order to
compress the data, we simply discard the temperature of Providence. When we want
to reconstruct the data, we simply approximate the temperature in Providence by the
temperature in Boston.


Once again, we have to disappoint you by letting you know that there is no genie! There is
just data. In the coming weeks, we are going to find out how to find basis vectors that can
be useful for dimensionality reduction for a given set of random data, given some training
data. This will be particularly useful in speeding up computations where instead of doing
computations on all the dimensions of the data we have, we perform computations on fewer
dimensions.


2. We will finally be dealing with images of faces. We are going to compress these face images in
a similar way as the temperature data (we give you the bases). Here, the data have really high
dimensions (each pixel is a dimension). The basis vectors that we give you (matrix U) don’t span
the entire high dimensional space (so there will be lossy compression).


(a) Load the file face_bases.mat in MATLAB. You will see a 3-dimensional array
test_images, of dimensions 256 x 256 x 424, and a matrix U of dimensions 65536 x
424. The test_images array contains 424 grayscale images. Each image is 256 x 256
pixels.


(b) Select any one image from the set of 424 and call it T. Display this image using »
imagesc(T); colormap('gray'). This image is currently represented as a
256 x 256 matrix of grayscale values. We will find it very convenient to work with
vectors instead of matrices representing an image. Therefore, to make our lives simple,
we will take the data for an image which is stored in a matrix and store it in a vector.
We are going to vectorize this image by stacking its columns one on top of another to create
a single vector that is $(256)^2$ x 1, i.e. 65536 x 1, which will be a lot easier to work with. This
operation can be accomplished in MATLAB as follows: » Tstacked = reshape(T,
65536, 1);. When you need to recover the unstacked version of the image, you can
undo the stacking as follows: » Tunstacked = reshape(Tstacked, 256,
256).


(c) The matrix U contains a set of 424 linearly independent 65536 x 1 vectors provided by
the genie. Approximate the Tstacked vector as a linear combination of the first 10
columns of U, and call this vector Tapprox10. Tapprox10 should be a 65536 x
1 vector, and you will only have 10 weight values to find this approximation. See how
well this approximation works by reshaping Tapprox10 into a 256 x 256 matrix and
displaying it using imagesc and colormap('gray').

(d) Now repeat the previous exercise with the first 50 columns of U and then again with the
first 100 columns of U.


You should observe that the more columns of U you use, the better the approximation. Note
here that we are trying to approximate a 65536 dimensional vector using 10, 50 and 100
numbers. Therefore, you should not expect the approximation to work super well, but with
100 columns of U, you should be able to recognize the picture. At a later date we will quantify
the fidelity of the approximation.


Note that more sophisticated image compression algorithms use methods that rely on special
properties of images and human vision in order to achieve a high degree of compression.
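The stack, approximate, and unstack workflow can be tried on a toy problem before touching the face data. In the sketch below everything is a hypothetical stand-in: a 4 x 4 matrix plays the role of the image, and the columns of a scaled Hadamard matrix play the role of the genie's basis. The weights are found with the backslash operator as a least-squares fit, which is one reasonable choice when the basis columns are only known to be linearly independent.

```matlab
% Hypothetical 4 x 4 "image" standing in for a 256 x 256 face
T = magic(4);
Tstacked = reshape(T, 16, 1);         % stack the columns into a 16 x 1 vector

% Hypothetical basis: columns of a scaled Hadamard matrix are orthonormal
U = hadamard(16) / 4;
k = 10;                               % number of basis vectors to keep
Uk = U(:, 1:k);

w = Uk \ Tstacked;                    % k weights (least-squares fit)
Tapprox = Uk * w;                     % approximation in the span of Uk
rmsError = sqrt(mean((Tstacked - Tapprox).^2))

Tunstacked = reshape(Tapprox, 4, 4);  % back to image shape for display
```

Because the columns used here happen to be orthonormal, the same weights could also be computed as Uk' * Tstacked. Increasing k toward 16 should drive the rms error toward zero, mirroring the 10, 50, and 100 column comparison in the exercise.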


12.4 Data: Many Measurements of the Same Thing 





One of the simplest forms of data is a set of many measurements of nominally the
same thing. Depending on the goal of our analysis, this might encompass measurements of the same
quantity across many different situations, or many instances of the same situation.

Visualizing Measurements of the Same Thing


It’s usually a good idea to look at data before you start calculating things associated with it. 

You’ve surely encountered these ideas before, but for the sake of completeness, we'll highlight a couple 
of ideas here. If you have a large number of data points (say, for example, that you measured the heights of a 
bunch of different people), you might choose to simply plot the data versus the person number — the index. 
Note here that the data is plotted as individual points, since each point represents a measurement. Ideally 
we might also include error bars here to indicate our uncertainty in a given measurement, but for now, let’s 
leave that out. 


Figure 12.4: An example of a single-variable plot.


Alternatively, you could also visualize many measurements of the same thing by creating a histogram. 
This is a representation of how many measurements fall into different “bins”: the height of a given bar is the 
number of samples that fall within the range associated with the bar. For example, in the figure, you can see 
that about 20 million people made between 0 and $5000 in 2008. You've likely seen this kind of thing before 
as well: it’s not an uncommon way to represent test scores.

Figure 12.5: An example of a histogram.


Note, of course, that how a histogram looks depends on what you choose for the bins - both how many 
there are, and where they are centered! 


Common Figures of Merit for the Same Thing 


While looking at the data is certainly helpful, we can also extract or calculate a couple of important figures 
of merit of the data. The first is the average, or mean of the data, given by summing all the elements in the 
dataset {d;} and dividing by the number N of elements in the set: 


1x 
w= a d; (12.1) 


Note that if our data is a continuous function f(x) over a range of the independent variable x as opposed to 
a set of discrete points, we can express the same thing as an integral: 


$$\mu = \frac{\int_{\text{range}} f(x)\,dx}{\int_{\text{range}} dx} \qquad (12.2)$$


The average captures the center or expected value of the distribution of data. In addition to this, it is often
helpful to capture the spread of the data around this average. There are a few different metrics which are
used for this. A simple one is the variance from the mean, $\sigma^2$: the average of the squared difference between
each data point and the mean.


$$\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} (d_i - \mu)^2 \qquad (12.3)$$
Please note that this definition normalizes using N — 1, but you will often see alternative definitions which 
normalize using N. Another commonly encountered measure is the standard deviation, which is simply the 
square root of the variance from the mean: 





$$\sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (d_i - \mu)^2} \qquad (12.4)$$
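
To make the formulas concrete, here is a minimal sketch that evaluates Equations 12.1, 12.3 and 12.4 directly and compares them with MATLAB's built-in mean, var and std. The dataset d is made up for illustration.

```matlab
% Hypothetical dataset (made up for illustration).
d = [2 4 4 4 5 5 7 9];
N = numel(d);

mu     = sum(d) / N;                   % Eq. (12.1)
sigma2 = sum((d - mu).^2) / (N - 1);   % Eq. (12.3), normalized by N-1
sigma  = sqrt(sigma2);                 % Eq. (12.4)

% The built-ins use the same N-1 normalization by default.
muBuiltin     = mean(d);
sigma2Builtin = var(d);
sigmaBuiltin  = std(d);

% Passing 1 as the second argument switches to normalization by N.
sigma2ByN = var(d, 1);
```

For this dataset the two normalizations give noticeably different variances (32/7 versus 4), which is why it matters which definition a given tool uses.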
Exercise 12.6


1. Look at the single variable plot in Figure 12.4 above. Estimate the value of the mean and the
value of the standard deviation. What are the units of each?


2. Look at the histogram plot in Figure 12.5 above. Estimate the value of the mean and the value
of the standard deviation.


3. What are the mean and standard deviation of this data set? (Do this in your head!)


{1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3, 1, 3}


4. Begin by considering the simple dataset of the high temperatures in Needham for ten days in
March:
T = {57, 61, 46, 43, 46, 46, 54, 46, 46, 55} (12.5)


(a) By hand, create a histogram of this data. What size bin makes sense? What bin centering 
makes sense? 


(b) By hand, compute the mean temperature over these ten days. If you look at the data, 
does this mean make sense? 


(c) By hand, compute the variance and standard deviation of the temperature over these ten 
days. If you look at the data histogram does this make sense? 


(d) This dataset has a flaw: it has a small number of datapoints. What do you see as the
possible effects of having such a small sample?


5. Now consider the larger dataset below of the approximated heights of the Olin faculty, measured 
in inches. In MATLAB, create a vector which has this dataset as the entries. 


H ={63, 66,71, 65, 70, 66, 67, 65, 67, 74, 64, 75, 68, 67, 70, 73, 66, 70, 72, 62, 68, 
70, 62, 69, 66, 70, 70, 68, 69, 70, 71, 65, 64, 71, 64, 78, 69, 70, 65, 66, 72, 64} 


(a) Computationally histogram this data. What size bin makes sense? What bin centering 
makes sense? Try a few different combinations. See MATLAB function histogram. 


(b) Computationally, find the mean, standard deviation, and variance of this dataset. See 
MATLAB functions mean, std, and var. 


(c) Do the mean, standard deviation, and variance make sense given the histogram of the
data?




12.5 Brightness and Contrast 





The brightness and contrast of images are controlled by scaling the histogram of the pixel values. Try this out!
Note: for displaying images in this part, make sure NOT to use imagesc: imagesc is specifically set up to
auto-scale the image to use the full range from 0 to 255. Just use the command ‘image’.


Exercise 12.7 
1. Load an image of your choice into MATLAB using the imread command. (Make sure you
are in the correct directory for the image or give it the complete path.) Display the image
using the ‘image’ command.


2. If your image is a color image, convert it into grayscale by using the rgb2gray command.


3. Create a vector of the intensities in your image: use the reshape command to create a giant
column vector in which the first n elements are the first column of the image, the next n are
the second column, etc.


4. Make a histogram of the intensity values in your image. Note that the default variable type
for image data is uint8 (8-bit unsigned integer) which is an integer that ranges from 0 to 255.
Does your image use the entire range of values from 0 to 255? What is the minimum pixel
value used? What is the maximum?


5. Find the mean of the intensities in your image data. Find the standard deviation. Is the intensity
data well-centered on the available range? The location of the intensity data in the range
determines the brightness of the image. How does the standard deviation compare to the
available range? Does the intensity data span a good portion of the available range? This
affects the contrast.


6. To adjust the brightness of your image, you can scale all of the intensity values by a multiplicative
factor down (towards darker values) or up (towards brighter values). Based on looking at
the histogram, should your image be brightened? Dimmed? Why?


7. To adjust the contrast, you make a linear mapping of the existing range onto the full 0 to 255
range. In other words, if you think of the current intensity value as your independent variable
x, and the new intensity value as the dependent variable y, a contrast adjustment is defined by
a function y = f(x). Propose an equation for a line which gives you the “best” range of y’s,
given the input intensity values in the image. You should be able to justify this based on the
histogram of the image. Note that any values of y that end up below 0 should be interpreted
as 0, and any values over 255 should be interpreted as 255.


8. Implement brightness and contrast adjustment:


(a) Load a picture of a face. 
(b) Analyze the intensity histogram. 


(c) Calculate the adjusted face by applying both brightness and contrast adjustments to 
make it as “good” as possible. 


(d) Create a figure that includes four subplots: the original image, the original intensity 
histogram, the new image, and the new intensity histogram. 


9. What would happen if the function for contrast adjustment was not linear? Why might you 
choose a non-linear function for this mapping? 
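
The mechanics of the steps above (stacking pixels into a vector, applying a line y = a*x + c, and clamping to the 0 to 255 range) can be sketched on a tiny made-up image. The 2 x 2 matrix and the values of a and c below are arbitrary placeholders, not recommended settings; choosing good values from the histogram is the point of the exercise.

```matlab
% Hypothetical 2x2 grayscale "image" standing in for real image data.
img = uint8([50 100; 150 200]);

% Stack the columns into one long column vector (column-major order).
v = reshape(img, [], 1);          % [50; 150; 100; 200]

% Placeholder slope and intercept for the line y = a*x + c.
% These values are arbitrary and only demonstrate the mechanics.
a = 1.5;
c = -40;

% Do the arithmetic in double precision, then clamp to [0, 255].
y = a * double(img) + c;          % [35 110; 185 260]
y = min(max(y, 0), 255);          % 260 is clamped to 255
adjusted = uint8(y);
```

Converting to double before scaling avoids intermediate rounding and saturation in uint8 arithmetic; the explicit min/max makes the clamping rule from the exercise visible.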




12.6 Conceptual Quiz 





Please complete the conceptual quiz on Canvas. 
Solution 12.4
1. 
2. The size of the array is 740 x 740 x 3. 


3. You should see the following picture: 





Figure 12.2: Giraffe 


4. The gray-scale version of this image is represented by a 740 x 740 matrix. 


5. Create the matrix grayscaleX which represents the grayscale version of this image 
using » grayscaleX = 0.2989*X(:,:,1) + 0.5870*X(:,:,2) + 
0.1140*X(:,:,3). 


6. You should see the following image: 
Figure 12.3: Gray Giraffe


Solution 12.5 


1. (a) To check that u1 and u2 are orthogonal, type » transpose(u1)*u2. The result
should be zero. Check the other five pairs of vectors.
To check that u1 has unit length, type » transpose(u1)*u1. The result should
be one. Check the other three vectors.
Alternatively, if you build the matrix U that has u1 in the first column, u2 in the second
column, and so on, then we can simply examine $U^T U$: if the columns of U are
orthonormal then the product should be the identity matrix. So you could try typing »
transpose(U)*U.

(b) First we need to construct a matrix U with columns each of the $u_i$:

» U=[u1 u2 u3 u4],
and convert T to the basis of $u_i$ vectors by multiplying » Tu=transpose(U)*T.
You can recover the original data with » U*Tu.


(c) The solution is the same as the previous part except we only include u1, u2, u3 when
constructing our matrix U.
» U=[u1 u2 u3],
and convert T to the basis of $u_i$ vectors by multiplying » Tu=transpose(U)*T.
You can recover the original data approximately with » Tapprox = U*Tu.
The RMS error can be computed as follows.
» E = Tapprox - T;
» RMS = sqrt(mean(E.^2, 'all'))
0.805 degrees Fahrenheit
(d) We can calculate our approximate version of the temperatures and the RMS with the
following code.
» Tapprox = [T(1:3,:); T(1,:)]
» E = Tapprox - T;
» RMS = sqrt(mean(E.^2, 'all'))
1.2311 degrees Fahrenheit


2. (a) The exercise involves trying out some steps that are given, so no separate solution is
needed.
(b) The exercise involves trying out some steps that are given, so no separate solution is
needed.


(c) Isolate the first 10 columns of U using » U10=U(:,1:10);. Then determine the
weights for each of these column vectors using
» Tweights10=transpose(Tstacked)*U10;
and then take the linear combination
» Tapprox10=U10*transpose(Tweights10);
Then we unstack the vector into a matrix
» Tapprox10unstacked=reshape(Tapprox10, 256, 256);
and display the image.


(d) The solution here is the same as part (c), except you would use U(:,1:50)
or U(:,1:100) to extract the relevant column vectors.


Solution 12.6 


1. Assuming that the “heights” plotted are heights of randomly selected humans, then the unit 
for the mean and standard deviation is inches. 


2. We can create a vector with the numbers of people in each bin by estimating from looking at 
the histogram. Create a vector to store this information in MATLAB. 
» numPeople = [20; 22; 23; 18; 17; 15; 14; 12; 10; 8; 9; 5;
6; 2; 0; 4; 3; 2; 2; 1; 14];
Then, create a vector of the bin centers. Here, we use a (conservative) estimate that the bin
center for the "over 100k" group is 200k. (You will get a different answer if you choose a different
value.)


» binCenters = [(2500:5000:97500)'; 200e3];
» mu = sum(numPeople.*binCenters)/sum(numPeople)
which gives something on the order of 41k. For the standard deviation, we do the following


» sigma = sqrt(sum(numPeople.*((binCenters-mu).^2))/(sum(numPeople)-1))


This leads to a standard deviation of around 48K! 


3. Since half the digits are 1 and the other half are 3, the mean will be the average of 1 and 3, so
$\mu = 2$. Looking at the formula for standard deviation, we can see that $(d_i - \mu)^2 = 1$ for each
data point, so $\sigma = \sqrt{20/19}$.


4. (a) There are many ways to pick a reasonable set of bin centers and widths. E.g. one could 
pick 3 bins, centered at 46, 52, 58. 
(b) We compute $\mu = 50$.
(c) We compute $\sigma = 6.15$.


(d) The data could be skewed. In this example, it appears that summer and winter temperatures
are not represented in the data set.


5. (a) By simply entering » histogram(H), MATLAB automatically chooses bins of size 
one. 


(b) Using MATLAB we find that $\mu = 68.1429$, $\sigma = 3.5721$ and $\sigma^2 = 12.7596$.


(c) If we look at the histogram with bin size 2 as shown in the following figure, we see that
the mean height is around 68. Additionally, there is not a significant variation in the
data (unlike the incomes data you saw earlier), so we expect the standard deviation to be
small relative to the mean.
Solution 12.7


A sample solution LiveScript is here. To run the solution, you will need this image. 
13.7 OPTICs discussion (as a class) (10 minutes)





13.1 Finish up orthogonal projection exercises (15 minutes) 


We'll take the first 15 minutes of class to finish up the problem set that we were working through last time. 


13.2 Discussion Framing (5 minutes) 


Today we'll be talking about a constellation of issues that arise when AI technology, like facial recognition, is
deployed in society. As the historian Melvin Kranzberg famously remarked, “Technology is neither good nor
bad; nor is it neutral.” As you saw in the readings from the homework, the effect of AI technology in society
intersects a number of sensitive issues around race, class, and gender. Because AI intersects with these
sensitive issues, it helps to take a few minutes to consider some guidelines for having fruitful discussions at
your tables.


• Check out this poster put together by some Oliners with suggestions for having conversations on
sensitive topics.


• The readings provide information and framing, which we find is very helpful to finding common
ground when discussing issues that individuals may relate to in very different ways.


• As you may be relatively new to these ideas, consider adopting a mindset of identifying key questions
rather than drawing conclusions.


• When talking about the effect of a technology on a group that has been historically marginalized and
if you are not a member of this group, you should be particularly sensitive in these discussions. Be
conscious of the ways in which your words might be experienced by those who may have faced a
history of discrimination due to being a member of this group.
13.3 Unpacking the Readings (15 minutes)


Write down key concepts and clear up points of confusion on the readings. Here are some specific prompts 
you might use to spur discussion in your group. 


1. What parts or quotes from the readings were most surprising / impactful to you? 


2. Were you surprised by your reaction to reading any of the material (e.g., felt unexpectedly angry, sad, 
indifferent)? 


3. What are the big questions that have been raised for you (these could be things that were already on
your radar or new ones entirely)? These questions could relate to our society as a whole, your role as
a citizen within society, your role as an Olin student, your future career path, etc.


4. How do these readings intersect with knowledge you’ve gained from other contexts (e.g., in other 
courses or in your daily life experience)? 


As a reminder, here are the links to the readings. 


• Joy Buolamwini’s written testimony on bias in facial recognition technology (you may have watched
this instead).


• Gender Recognition or Gender Reductionism?: The Social Implications of Embedded Gender Recognition
Systems


• Principles for Accountable Algorithms


• Google’s Inclusive ML


13.4 Share Your Positive Application of AI (10 minutes) 


Go around and share the application of AI that you think has the potential for great positive impact on 
society. Say a little bit about what you learned and how you think it would have a positive impact (e.g., in 
what ways and for whom). 

Discussion Prompts 


• Based on the applications shared, is each of these applications universally positive or are there some
where one group would suffer so that another may benefit? As a future technologist (someone who
will create, work with, or otherwise be an expert on technology), what questions does this potential
tension bring up for you?


• What (if any) common themes are there for the applications that folks came up with (e.g., particular
domains the applications are for, particular intended user groups, etc.)?


13.5 Olin Principles of Technology, Innovation, and Consequences 
(OPTICS) (25 minutes) 


Next, we’re going to run through an activity developed by Caitrin Lynch and Rob Martello that will help us 
try to extract principles for creating technologies in a way that considers consequences and societal benefit / 
harm. The prompts are in the OPTICs Google Slide Deck.


13.6 Read OPTICs of Other Groups (individually) (10 minutes) 


Look through the OPTICs that other groups created and note anything interesting. You'll
have a chance to share your insights with your group.
13.7 OPTICs discussion (as a class) (10 minutes)


This is your chance to share your takeaways from this activity with the rest of the group. We'll do a brain 
dump in the Zoom chat and then give folks a chance to share their thoughts verbally. You can comment on 
your group’s OPTICS, similarities / differences with other groups, or whatever is on your mind! 
14.1 Debrief [15 mins]


• Please discuss your work with the folks in your breakout-room, and get help with the ideas that you
are still confused by.


In the next set of exercises we will explore a common method for finding “the” solution of a linear system
of algebraic equations (Ax = b) in the case where there are more equations than unknowns (more rows
than columns). We will first need to synthesise some previous ideas about the span of vectors.


14.2 Range of A [15 mins] 
We discussed earlier the concept of the span of a collection of vectors. Recall that the span of a collection of
vectors is the set of all linear combinations of the vectors. Now we will apply this concept to the columns of
a matrix:


Definition: The Range of a matrix A is the span of its columns. 


Exercise 14.1 


Describe in words the Range of the following matrices: 


asf 
14.3 Exact Solution to Ax = b [15 mins]


When does a linear system of algebraic equations, Ax = b, have a solution? Since the product Ax is a
linear combination of the columns of A, Ax = b will have a solution if and only if b is in the Range of
A. Think about that, and complete the following exercise.
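
One way to check this condition numerically, offered here as a sketch rather than as the course's method, is to compare ranks: b is in the Range of A exactly when appending b as an extra column does not increase the rank. The matrix and vectors below are made up for illustration.

```matlab
% Hypothetical matrix whose Range is the xy-plane in 3D.
A = [1 0; 0 1; 0 0];

bIn  = [2; 3; 0];   % lies in the xy-plane
bOut = [2; 3; 1];   % has a z-component, so it is outside the Range

% b is in the Range of A exactly when appending it as an extra column
% does not increase the rank.
isInRangeIn  = rank([A bIn])  == rank(A);
isInRangeOut = rank([A bOut]) == rank(A);
```

Here isInRangeIn is true and isInRangeOut is false, matching the geometric picture of a plane with one point on it and one point off it.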


Exercise 14.2 


Which of the following linear systems of algebraic equations will have a solution? Think about it 
from an equation perspective and the Range of A perspective. 


fia 


2. $A = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $b = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$


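3. $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$ and $b = \begin{bmatrix} 3 \\ 7 \\ 11 \end{bmatrix}$


4. $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$ and $b = \begin{bmatrix} 3 \\ 7 \\ 5 \end{bmatrix}$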





14.4 Approximate solution to Ax = b [30 mins] 


You should have found that some of these systems do not have a solution in the usual sense, i.e. there is no
vector x which makes the equation Ax = b true. We might refer to such a solution as an exact solution. We 
will now consider an approximate solution, i.e. a vector x which approximately satisfies Ax = b. We will 
consider a particular approximation based on orthogonal projection now, and later in QEA we will look at 
this approximation from a different perspective where it is known as the Least-Squares approximation. We 
met orthogonal projection earlier in the module when we spoke about vector components and basis vectors. 


Exercise 14.3 


Hold your hand up in front of you, and think about it as occupying a location in 3D. 


1. Point to the location on each of the walls surrounding you that is closest to your hand. 


2. Point to the location on the floor that is closest to your hand. 
3. In your other hand, hold a flat object (like a piece of paper or a book) at some angle. Now
imagine extending the surface of this object so that it is larger than the room you are in. Now 
point to the location on the extended flat surface that is closest to your hand. 


4. What do you notice about the relationship between the “pointing” vector and the surface being 
pointed at? 







Now let’s put this in the context of solving Ax = b. 


• If b is not in the Range of A then we will define an approximate solution by orthogonal projection of
b onto the Range of A.


• The “pointing” vector from b to the relevant point in the Range of A is $Ax - b$.


• Since the Range of A is defined by the span of the columns of A then the “pointing” vector must be
orthogonal to every column of A.


  - This implies that $A^T (Ax - b) = 0$. (Think about why this must be true.)


  - Re-arranging this equation leads to $A^T A x = A^T b$. The matrix $A^T A$ is a square matrix (which we
    will meet again and again this module).


• This is a linear system with equal numbers of equations and unknowns and can therefore be solved
using our usual techniques. Did you get that? You should re-read this paragraph a few times. To
summarize:


The approximate solution to Ax = b based on orthogonal projection can be obtained by solving
$A^T A x = A^T b$


This solution is also known as the least-squares solution because it minimises the distance 
between b and the Range of A (more about this later). 


Warning: Do not think about x defining a coordinate system that b lives in! When you draw a 
picture you should think about the space that the columns of A live in. We are projecting b 
onto a basis defined by the columns of A. The solution vector x is better thought of as a set of 
“weights” or “coordinates” with respect to this basis. 
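
The orthogonality argument can be checked numerically. The sketch below uses a made-up system with three equations and two unknowns (not one of the exercise systems), solves the normal equations, and confirms that the “pointing” vector is orthogonal to every column of A.

```matlab
% Hypothetical overdetermined system: 3 equations, 2 unknowns.
A = [1 0; 1 1; 1 2];
b = [1; 2; 2];

% Solve the normal equations A'*A*x = A'*b.
x = (A' * A) \ (A' * b);       % approximately [1.1667; 0.5]

% The "pointing" vector Ax - b should be orthogonal to every column of A,
% so A'*(A*x - b) should be zero up to round-off.
r = A * x - b;
orthogonalityCheck = A' * r;
```

The entries of x are the weights on the columns of A that produce the closest point A*x in the Range.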


Exercise 14.4 


1. Consider the linear system Ax = b where $A = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $b = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$. (You've already thought
about this earlier.)


(a) Sketch the Range of A and locate the point in the Range that is closest to b.
(b) Multiply both sides of Ax = b by $A^T$ and solve the resulting linear system.

2. Consider the linear system Ax = b where $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$ and $b = \begin{bmatrix} 3 \\ 7 \\ 5 \end{bmatrix}$. (You've already
thought about this earlier.)


(a) Sketch the Range of A and locate the point in the Range that is closest to b.
(b) Multiply both sides of Ax = b by $A^T$ and solve the resulting linear system.







14.5 Solving Ax = b in Matlab [10 mins] 

In many ways Matlab makes life easy for us. There is a single command in order to solve a linear system 
Ax=b 

>> x = A\b 

although it can also be used by typing 

>> x = mldivide(A,b) 


If there are more rows than columns then Matlab finds the approximate (least-squares) solution we discussed
above. If there are equal numbers of rows and columns then Matlab computes a solution, in the general case
by LU decomposition. If there are fewer rows than columns then there are infinitely many solutions and
Matlab returns one of them. The solution it computes is not an approximation, but it is a basic solution with
at most as many nonzero entries as the rank of A, which is generally not the shortest solution vector. The
minimum-length solution can be obtained with pinv(A)*b.
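
The three cases can be tried side by side. The matrices below are made up for illustration and are not the exercise systems; for the wide case, pinv is used to obtain the minimum-length solution so that it can be compared with whatever exact solution the backslash operator returns.

```matlab
% Tall system (more rows than columns): A\b returns the least-squares
% solution, which matches the solution of the normal equations.
At = [1 0; 1 1; 1 2];
bt = [1; 2; 2];
xBackslash = At \ bt;
xNormal    = (At' * At) \ (At' * bt);

% Square system: A\b returns the exact solution.
As = [2 1; 1 3];
bs = [3; 5];
xs = As \ bs;                 % [0.8; 1.4]

% Wide system (fewer rows than columns): infinitely many exact solutions.
Aw = [1 1];
bw = 2;
xw       = Aw \ bw;           % one exact solution
xMinNorm = pinv(Aw) * bw;     % the minimum-length solution, [1; 1]
```

In every case it is worth checking the residual A*x - b: it is zero for the square and wide systems, and nonzero but as small as possible for the tall one.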


Exercise 14.5 


For each of the linear systems in Exercise 14.4 please find the solution in Matlab using A\b.
Solution 14.1
1. The column is a two-dimensional vector. The span is a line (slope = 1) in 2D space. 


2. The columns are linearly-independent three-dimensional vectors. Their span is therefore a 
plane in 3D space. Since all the z-entries are zero, the plane is actually the xy-plane. 


3. The columns are linearly-independent three-dimensional vectors. Their span is therefore a 
plane in 3D space. The plane is defined by the column vectors. 


Solution 14.2 


1. The Range of A is all multiples of | Since b is a multiple of this vector then there is a 


solution. From an equation point of view, the solution is simply x = 5. 


2. The Range of A is all multiples of $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$. Since b is not a multiple of this vector then there is no
solution. From an equation point of view this makes sense because we are demanding that
x = 2 and x = 3 at the same time.


3. The Range of A is a plane in 3D. Since b is the sum of the columns it must be in the Range of A 
and so there is a solution. From an equation point of view there are two linearly-independent 
equations in two unknowns. 


4. The Range of A is a plane in 3D. Since b is not in this plane there is no solution. From an 
equation point of view this makes sense because trying to solve the equations results in an 
inconsistency. 


Solution 14.3 


In each case the “closest” point is the location where the “pointing” vector meets the surface at right 
angles, i.e. they are orthogonal. 


Solution 14.4 


1. Consider the linear system Ax = b where $A = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $b = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$. (You’ve already thought
about this earlier.)
(a) Sketch the Range of A and locate the point in the Range that is closest to b. (The Range
is a straight line and the point is the orthogonal projection onto this line.)


(b) Multiply both sides of Ax = b by $A^T$ and solve the resulting linear system. (You should
find that x = 5/2.)


2. Consider the linear system Ax = b where $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$ and $b = \begin{bmatrix} 3 \\ 7 \\ 5 \end{bmatrix}$. (You've already
thought about this earlier.)
(a) Sketch the Range of A and locate the point in the Range that is closest to b. (The Range
is a plane in 3D and the point is the orthogonal projection onto this plane.)


(b) Multiply both sides of Ax = b by $A^T$ and solve the resulting linear system. (You should
find that x = −3 and y = 7/2.)
Solution 14.5
1. >> A = [1;1]

A =
     1
     1

>> b = [2;3]

b =
     2
     3

>> A\b
ans =
    2.5000


2. >> A = [1 2;3 4;5 6]

A =
     1     2
     3     4
     5     6

>> b = [3;7;5]

b =
     3
     7
     5

>> A\b
ans =
   -3.0000
    3.5000
Learning Objectives


Concepts


• Describe the physical significance of the mean and standard deviation of a data set.


• Describe the physical significance of correlation, anti-correlation or non-correlation of two
variables.


• Approximate the mean and standard deviation from a histogram of the data.


• Interpret the meaning of a pair of images that has a Pearson Correlation Coefficient of about
0, 0.5, or 0.9.


• Interpret the physical/mathematical meaning of the diagonal and off-diagonal elements in a 2 x 2
correlation matrix, $C = A^T A$, if given the equation for the Pearson Correlation Coefficient.


MATLAB skills
• Compute the dot product of two vectors.


• Set up the appropriate matrices to compute the correlation coefficient between two variables.


15.1 Correlation 


Now let’s consider that we have N measurements of two different associated quantities and want to test 
whether these are linearly correlated (if one goes up, the other also goes up), anti-correlated (if one goes up, 
the other goes down) or uncorrelated (the behavior of one cannot be predicted by watching the behavior of 
the other). (Please note that correlation has nothing to do with causality!). There are many different measures 
of correlation, but we will discuss here one of the most common, the Pearson Correlation Coefficient. 
For a pair of associated datasets $X = \{x_i\}$ and $Y = \{y_i\}$, each with $N$ elements, we define the Pearson
Correlation Coefficient to be:


$$\rho(X,Y) = \frac{1}{N-1} \sum_{i=1}^{N} \frac{(x_i - \mu_x)(y_i - \mu_y)}{\sigma_x \sigma_y} \qquad (15.1)$$


where $\mu_x$, $\mu_y$, $\sigma_x$, and $\sigma_y$ are the means and standard deviations of the datasets. Essentially, for each pair of
values, we take the product of the variations from the mean, then sum these products up over all pairs of 
values and normalize by the expected variation as characterized by the standard deviation. If the two values 
are consistently always on the same side of the mean, then each term in the sum will contribute positively, 
and the total value will be close to +1, indicating positive correlation. If the two values are consistently on 
the opposite sides of the mean, then each term in the sum will contribute negatively, and the total value will
be close to —1, indicating anticorrelation. If, for every pair, it is just as likely that the two values will be on 
opposite sides of the mean as on the same side of the mean, then the sum will go to 0, and the two values 
are uncorrelated. 
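
As a minimal sketch of this definition, the code below evaluates the sum term by term for two short made-up vectors and cross-checks the result against MATLAB's corrcoef. The data values are hypothetical and chosen only to keep the arithmetic small.

```matlab
% Hypothetical paired measurements (made up for illustration).
x = [1 2 3 4 5];
y = [2 4 5 4 5];
N = numel(x);

% Standardize each dataset: subtract the mean, divide by the standard
% deviation (std uses the N-1 normalization by default).
zx = (x - mean(x)) / std(x);
zy = (y - mean(y)) / std(y);

% Sum the products of the standardized values and normalize by N-1.
rho = sum(zx .* zy) / (N - 1);

% Cross-check against the built-in; the off-diagonal entry is the
% coefficient between x and y.
R = corrcoef(x, y);
rhoBuiltin = R(1, 2);
```

Most of the products zx .* zy are positive or zero here, so the coefficient comes out positive, as the paragraph above predicts for values that tend to fall on the same side of their means.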
Consider the following data: 





State        Poverty  Infant Mort  White  Crime  Doctors  TrafDeaths  University  Unemployed  Income
Alabama      15.7     9.0          71.0   448    218.2    1.81        22.0        5.0         42,666
Alaska       8.4      6.9          70.6   661    228.5    1.63        27.3        6.7         68,460
Arizona      14.7     6.4          86.5   483    209.7    1.69        25.1        5.5         50,958
Arkansas     17.3     8.5          80.8   529    203.4    1.96        18.8        5.1         38,815
California   13.3     5.0          76.6   523    268.7    1.21        29.6        7.2         61,021
Colorado     11.4     5.7          89.7   348    259.7    1.14        35.6        4.9         56,993
Connecticut  9.3      6.2          84.3   256    376.4    0.86        35.6        5.7         68,595
Delaware     10.0     8.3          74.3   689    250.9    1.23        27.5        4.8         57,989
Florida      13.2     7.3          79.8   723    247.9    1.56        25.8        6.2         47,778
Georgia      14.7     8.1          65.4   493    217.4    1.46        27.5        6.2         50,861
Hawaii       9.1      5.6          29.7   273    317.0    1.33        29.1        3.9         67,214
Idaho        12.6     6.8          94.6   239    168.8    1.60        24.0        4.9         47,576


Exercise 15.1 
1. Look over the data. By eye, which columns look correlated? Anticorrelated? Uncorrelated? 


2. Choose your two favorite columns of data from this dataset. Input these into vectors in Matlab. 
For each of these vectors, subtract off the mean, and then divide out the standard deviation. 


3. With these vectors, how would you directly compute the correlation coefficient between them? 
Go ahead and do this in MATLAB, and reflect on your result. Don’t forget to normalize by 
1/(N — 1). 


Exercise 15.2 


A note of warning. Correlation does not imply causation! To drive this point home, visit the Spurious 
Correlation Website. Follow the link at the bottom of the site to discover and plot a spurious 
correlation of your very own. 
Correlation: The Idea, the Matrices, and the MATLAB


Now we are going to use matrix mathematics to construct correlation coefficients in an efficient manner. 
Let’s first consider a data matrix B which has two columns of data, each of which has N samples: 


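$$B = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \\ x_3 & y_3 \\ \vdots & \vdots \\ x_N & y_N \end{bmatrix}$$


If we subtract out the means and divide out the standard deviation and a factor of $\sqrt{N-1}$, we get the matrix
A:


$$A = \frac{1}{\sqrt{N-1}} \begin{bmatrix} \frac{x_1 - \mu_x}{\sigma_x} & \frac{y_1 - \mu_y}{\sigma_y} \\ \frac{x_2 - \mu_x}{\sigma_x} & \frac{y_2 - \mu_y}{\sigma_y} \\ \frac{x_3 - \mu_x}{\sigma_x} & \frac{y_3 - \mu_y}{\sigma_y} \\ \vdots & \vdots \\ \frac{x_N - \mu_x}{\sigma_x} & \frac{y_N - \mu_y}{\sigma_y} \end{bmatrix}$$


where $\mu_x$, $\mu_y$ and $\sigma_x$, $\sigma_y$ are the means and standard deviations of each column, and N is the number of
samples (rows).


The correlation matrix $C = A^T A$ has elements of the self and cross correlations between the
datasets.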
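
The construction can be sketched in a few lines. The data matrix below is made up for illustration and has three columns rather than two, to show that the same recipe extends to more datasets; the subtraction and division rely on MATLAB's implicit expansion (R2016b or later), which is an assumption about the version in use.

```matlab
% Hypothetical data matrix: N = 5 samples (rows) of three quantities (columns).
B = [1 2 9;
     2 4 7;
     3 5 6;
     4 4 4;
     5 5 1];
N = size(B, 1);

% Subtract each column's mean, divide by each column's standard deviation,
% and include the factor 1/sqrt(N-1).
A = (B - mean(B)) ./ std(B) / sqrt(N - 1);

% Correlation matrix: entry (i,j) relates column i to column j.
C = A' * A;

% Cross-check against the built-in, which takes the original data matrix.
Cbuiltin = corrcoef(B);
```

Comparing C with Cbuiltin entry by entry is a quick way to confirm that the normalization in A has been set up correctly.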


Exercise 15.3 
1. Before we start to use this idea, let’s think it through a bit... 


(a) What is the size of the matrix C? 
(b) What do the elements on the diagonal of this matrix represent? What must their values 
be? 


(c) What do the elements of the off-diagonal represent? What is element C12 of this matrix? 
What is element C21? What do you notice? Is this always going to be true? What about 
if we had three datasets? What can you say about the elements C13 vs C31? 


(d) If you create a data matrix that has completely identical columns of data, what should 
the correlation matrix look like? 


(e) If you create a data matrix that has completely uncorrelated datasets, what should the 
correlation matrix look like? 


Exercise 15.4 


In this exercise we want you to calculate the correlation matrix in MATLAB using the two favorite data
vectors you chose earlier.


1. In MATLAB, construct the correlation matrix C using the method described above. 


2. Check your results by using the MATLAB function corrcoef. Please note that the input to
this function is the original data vectors.


CHAPTER 15. HOMEWORK 5: DATA, CORRELATION, AND SMILE DETECTION
15.2 Correlation in Facial Recognition


Kinds of Correlation in an Image Set 


If we think about photos now, we can think about two different correlations: the correlation between a 
given pair of pixels (across all the pictures in a data set), and the correlation between photos (across all the 
pixels in those images). In order to compute an accurate correlation coefficient, you need to have multiple 
data points in each set being correlated, e.g., many pixels in each picture being correlated, or many pictures 
across which a pair of pixels (pixel locations) can be correlated. 

Think about what each of these correlations means. What would a high correlation between a given pair 
of images mean? What about a high correlation between a given pair of pixels (e.g., the upper-left-most 
pixel and the upper-right-most pixel)? It might help to open a few face images or draw some face sketches 
to think about. 


Exercise 15.5 


Consider six grayscale pictures, each with a resolution of m x n pixels. 
1. What is the size of the data matrix containing these six pictures as the columns?


2. What is the expression for the correlation matrix between the pictures? What size is this
correlation matrix?


3. What is the expression for the correlation matrix between different pixels? Pay careful attention
to the mean and standard deviation you are using. What is the size of this correlation matrix?


4. People’s faces are approximately left-right symmetric. How would you expect this to affect
the entries in the correlation matrix between different pixels?





Test your understanding... 


• Pull in six images from the class data matrix from last year’s QEA (current second years). The data
should be in the test_images variable in the face_bases.mat file. Take the six images and
put them in a variable called faces. Each of these images should come from a different person; there
are 8 images per person stored in the data matrix.


• Use the imresize command to bring them down to a smaller resolution (e.g., 25 x 25) using
dfaces = imresize(faces,[25 25]);
This should be a 25 x 25 x 6 matrix.


• Now reshape them appropriately to create a matrix in which each column is a (reshaped) face using
rdfaces = reshape(dfaces,size(dfaces,1) * size(dfaces,2),size(dfaces,3));
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Exercise 15.6 


1. Find the correlation between six different images. Which images have the highest correlation? 


2. Now find the correlation between pixels across images. Try taking a single column of this 
matrix and reshape that column into an image. What does that image tell you? You may want 
to repeat this reshape and visualization for columns 1, 25, 400, and 625 to get a feel for what is 
happening. 




15.3 Smile Detection—Concepts 





In this section we are going to use our toolbox of linear algebra skills to “detect” whether or not a person is 
smiling in a photograph. The approach that we will take is very common in machine learning - we will 
use a dataset to train our algorithm, and we will use a different dataset to test our algorithm. We will first 
develop the conceptual framework and then implement the approach in MATLAB. 


The Big Idea 


Let’s assume that we have 100 training photos of faces, each consisting of a 5 by 5 grid of pixels. Let’s pack
these into a matrix A with 100 rows and 25 columns, i.e., every row is a different face and every column is a
different pixel.

Let’s also assume that we have already classified every training face as “smiling” or “not-smiling”. Let’s 
create a column vector b with 100 rows (corresponding to each face) which has either 1 (smiling) or 0 
(not-smiling). 

Let’s now develop a linear system of algebraic equations by trying to express the vector b as a linear 
combination of the columns of A, i.e. 


Ax=b 


Notice that the vector x is a column vector with 25 rows - one row for each pixel. Since there are more
rows than columns, an exact solution generally does not exist, so we will find the approximate solution
by orthogonal projection, i.e. we will solve


$A^T A x = A^T b$


for the unknown vector x, which on paper takes the form


$x = (A^T A)^{-1} A^T b$


Now that we have the vector x, let’s use it to detect whether a test image is smiling. Assuming that the
test image is packed into a single row vector t (with 25 columns) then the product


tx


will return a scalar. If this scalar is close to “1” then we predict the face is smiling. If this scalar is close to
“0” then we predict the face is not smiling.
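To make the big idea concrete, here is a minimal MATLAB sketch of the same steps. It is only an illustration: the matrix A, the labels b and the test image t are random stand-ins (not real faces), and the cut-off of 0.5 is an assumption of ours rather than something fixed by the method.

```matlab
% Synthetic stand-in for the training set (assumption: random numbers, not real faces).
rng(0);                           % fix the random seed so the run is repeatable
A = rand(100, 25);                % 100 faces (rows) by 25 pixels (columns)
b = double(rand(100, 1) > 0.5);   % made-up labels: 1 = smiling, 0 = not smiling

% Solve the normal equations transpose(A)*A*x = transpose(A)*b for the weights x.
% The backslash operator solves the system without forming the inverse explicitly.
x = (transpose(A) * A) \ (transpose(A) * b);

% Score a hypothetical test image packed as a 1 x 25 row vector.
t = rand(1, 25);
score = t * x;                    % a scalar
isSmiling = score > 0.5;          % assumed cut-off halfway between 0 and 1
```

Because the labels here are random, the prediction itself is meaningless; the point is only the shape of each quantity (x is 25 x 1, score is a scalar) and the fact that the residual Ax - b is orthogonal to the columns of A.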


Exercise 15.7 


In this exercise you will be carefully reading and interpreting this big idea. We are including these
questions as a scaffold, pointing out interesting features along the way.


1. Read “The Big Idea” again!


2. Interpret what it means to write down the linear system of equations Ax = b and give a
meaning to the vector x.


3. Interpret the product $A^T A$ and the product $A^T b$.


4. The vector x does not satisfy Ax = b exactly. What does the expression Ax - b tell you?


5. How would you decide whether your “trained” algorithm was worth using on a test dataset?


6. Assume you had 40 test images with 25 pixels each and that you pack them into a matrix T
with 40 rows and 25 columns. Write down the matrix-vector product you would use for smile
detection on this test dataset.


7. How would you measure the accuracy of your predictions if we also provided you with the
data on whether each test image was smiling or not?





15.4 Smile Detection—Implementation


Please download the file smiles.mat from the Canvas site. If you load this file in MATLAB, you will then
have access to the following variables in your workspace.


train_data - a 3D array containing 19685 24 x 24 pixel images of faces

smile_flag_train - a vector of the same length as the number of images in train_data, with 1s indicating
which images are smiling

test_data - 500 24 x 24 pixel images of faces

smile_flag_test - a vector of the same length as the number of images in test_data, with 1s indicating
which images are smiling


The train_data and the associated smile_flag_train are the sets of data you should use to develop your
mathematical model. The test_data and its associated smile_flag_test are the sets of data you should use to
test your algorithm when you are finished!


Exercise 15.8 


Now you are going to implement a smile detector in MATLAB. You should consider following the 
procedure below to implement the smile detector. 


1. Sketch out a set of steps you would take in order to implement smile detection. (Just words 
here - no code. e.g. we will have to pack all images into a single matrix) 


2. Turn this set of steps into MATLAB pseudo-code. Identify important coding elements without 
implementing, e.g. we will use reshape to pack the given dataset into a matrix. 


3. Review the documentation for MATLAB functions that will be used and be clear on how to
use them before implementation, e.g. >> help reshape


4. Methodically implement smile detection in MATLAB, testing as you go.


Alternatively, you can use our walkthrough notebook. The notebook has embedded solutions, or you
can try it with minimal scaffolding using the suggested process above. Even if you decide not to use
the walkthrough notebook, it’s worth running the embedded solutions to pick up some techniques
for visualizing your smile detector model.




15.5 Conceptual Quiz 





Please see Canvas for the questions.


Solution 15.1


1. We are interested in the relationship between poverty and infant mortality. Generally speaking
it looks like high values of one correspond to high values of the other, and vice versa, so they
would seem to be correlated.


2. We are going to define these as column vectors in MATLAB as follows (just using first 6
observations for simplicity)


>> X = [15.7;8.4;14.7;17.3;13.3;11.4];
>> Y = [9.0;6.9;6.4;8.5;5.0;5.7];


Now we need to normalize each one - it is easier to use built-in MATLAB functions to find the
mean and standard deviation.


Xn = (X - mean(X))./std(X);
Yn = (Y - mean(Y))./std(Y);


3. To find the correlation coefficient we need to multiply every entry in Xn and Yn together and
then add them up. That sounds like a matrix operation that can be implemented as follows


>> coeff = transpose(Xn)*Yn./5


where we have normalized by N-1. The result is 0.5190, which indicates a substantial correlation
between poverty and infant mortality.


Solution 15.3 


1. (a) Since A has size N x 2, we know $A^T$ has size 2 x N. Then $C = A^T A$ has size 2 x 2.


(b) The elements on the diagonal represent the self-correlation of each data column. Each
element of the diagonal will be 1.


(c) The elements $C_{12} = C_{21}$ are the correlation between the two columns of data. Regardless
of size, the correlation matrix will be symmetric.


(d) If the data matrix had identical columns of data, the correlation matrix would be all 1s.


(e) If the data is uncorrelated, the off-diagonal entries in the correlation matrix will be all 0s,
so it will be the identity matrix. (Note: real data is unlikely to have exactly 0 correlation, just by
accident, so it will just have numbers that are close to 0.)


Solution 15.4


1. There are a few ways to do this. Here is one of them, where we use "mean" and "std" on the
original data vectors


X = [15.7;8.4;14.7;17.3;13.3;11.4];
Y = [9.0;6.9;6.4;8.5;5.0;5.7];
Xn = (X - mean(X))./std(X);
Yn = (Y - mean(Y))./std(Y);
A = [Xn Yn]./sqrt(5);
C = transpose(A)*A

C =
    1.0000    0.5190
    0.5190    1.0000


An alternative is to use "mean" and "std" on a data matrix. This works because both of these
functions will return a row vector containing the mean and standard deviation along each
column.


X = [15.7;8.4;14.7;17.3;13.3;11.4];
Y = [9.0;6.9;6.4;8.5;5.0;5.7];
B = [X Y];
A = (B-mean(B))./std(B)./sqrt(5);
C = transpose(A)*A

C =
    1.0000    0.5190
    0.5190    1.0000


2. The function corrcoef accepts the original data vectors as input


X = [15.7;8.4;14.7;17.3;13.3;11.4];
Y = [9.0;6.9;6.4;8.5;5.0;5.7];
C = corrcoef(X, Y)

C =
    1.0000    0.5190
    0.5190    1.0000


It would be worthwhile reviewing the documentation for this function by typing >> doc
corrcoef.


Solution 15.5 


1. Each picture is represented by mn data points. So the data matrix containing these six pictures 
as columns is mn x 6. 


2. To find the correlation we can either use the MATLAB code we developed earlier or the 
command corrcoef. The correlation matrix will be 6 x 6. 


3. To find the correlation between pixels, we need to take the transpose of our data matrix. This
new data matrix will be 6 x mn, since we have mn variables (each pixel) and 6 observations
(one per picture). The correlation matrix will then be mn x mn.


4. We would expect a high correlation between pixels that are equidistant from the vertical
centerline of the face.
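The sizes quoted in this solution can be checked numerically. The sketch below uses six random "pictures" at an assumed resolution of 4 x 5 (our choice, purely so the matrices are small); it relies on the fact that corrcoef treats each column of its input as a variable and each row as an observation.

```matlab
m = 4; n = 5;                        % assumed small resolution, for illustration only
pics = rand(m, n, 6);                % six made-up grayscale pictures
D = reshape(pics, m*n, 6);           % data matrix: one picture per column, size mn x 6

Cpics   = corrcoef(D);               % columns are pictures, so this is picture-to-picture
Cpixels = corrcoef(transpose(D));    % columns are now pixels, so this is pixel-to-pixel

size(Cpics)                          % 6 x 6
size(Cpixels)                        % mn x mn, here 20 x 20
```

With random data the entries themselves carry no meaning; only the shapes are of interest here.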


Solution 15.6 


1. To find the correlation matrix between the six images, enter
>> corrcoef(rdfaces)


2. To find the correlation across pixels, enter
>> pixels = corrcoef(transpose(rdfaces));
We can reshape the first column of this matrix and convert it into an image using
>> pixels1 = reshape(pixels(:,1),25,25);
>> imagesc(pixels1)
The (i, j) entry of this image tells us how similar the top-left pixel is to the (i, j) pixel. (Note:
the pixels are “numbered” 1-625 going down the first column, then down the second column,
and so on.) Repeating this reshape and visualization procedure on column 400 will give you
the correlation between pixel 400 and each of the other pixels, for example.


Solution 15.7
1. Read, read, read .... 


2. We are trying to take a linear combination of the data in order to predict whether each image 
is smiling or not. The vector x is the magic set of weights we have to use. Its size is the same 
as the number of pixels, so maybe it should look like a mask that we can place over an image 
to tell us whether it is smiling. 


3. The product $A^T A$ is like a pixel to pixel correlation matrix, except we haven’t scaled the data
matrix A. The product $A^T b$ is the sum of the images that are smiling.


4. The expression Ax — b tells us the error in predicting whether a training image is smiling or 
not. 


5. We could add up how often the predictor is correct and divide by the number of images to get 
an estimate of the accuracy. We would decide on a cut-off before we used it on a test dataset. 


6. It is simply Tx.


7. As before. Determine how many we got correct and average it. 
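As a small illustration of "determine how many we got correct and average it", here is a sketch with made-up numbers. The scores stand in for the entries of Tx, the labels are invented, and the cut-off of 0.5 is an assumption.

```matlab
% Hypothetical scores (entries of T*x) for eight test images and their true labels.
scores = [0.9; 0.2; 0.6; 0.4; 0.1; 0.8; 0.7; 0.3];
truth  = [1;   0;   1;   1;   0;   1;   0;   0];

cutoff = 0.5;                          % assumed threshold between "smiling" and "not smiling"
predicted = scores > cutoff;           % logical vector of predictions
accuracy = mean(predicted == truth)    % fraction of correct predictions, here 0.75
```

Six of the eight made-up predictions agree with the labels, which gives the accuracy of 0.75.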


Solution 15.8 


You can use the solutions that are embedded in the walkthrough notebook. 


Chapter 16 


Week 6a: Eigenvalues and 
Eigenvectors 





Schedule 
16.1 Debrief [30 mins]
16.2 Introduction to Eigenvalues and Eigenvectors [45 mins]





16.1 Debrief [30 mins] 


• Please discuss your homework, and get help with the ideas that you are still confused by.


16.2 Introduction to Eigenvalues and Eigenvectors [45 mins] 
We are now going to learn the secret of the genie ... 


Eigenvalues and Eigenvectors: Definition and Notation 


Consider a square n x n matrix A. A vector v is said to be an eigenvector of A with corresponding eigenvalue
$\lambda$ if v is not a vector of all zeros, and


$Av = \lambda v$. (16.1)


If we treat A as a transformation matrix then v is an eigenvector of A if it is simply scaled when acted
on by the matrix A. In other words, v does not change direction when acted upon by A. In general, an
n x n matrix has exactly n eigenvalues (although some of these may be repeated and some of these may be
complex!). Note that any nonzero scalar multiple of an eigenvector of a matrix is also an eigenvector of that matrix -
it’s only the direction of the eigenvector that matters.

In the next homework assignment we are going to develop formal techniques for finding the eigenvalues 
and eigenvectors of matrices. For now, we are going to focus on concepts and developing some intuition. 
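As a first piece of intuition, definition (16.1) can be checked numerically. In the sketch below the matrix A and the vectors v and u are hypothetical examples of ours, chosen only because the arithmetic is easy to follow by eye.

```matlab
A = [2 1; 1 2];     % hypothetical matrix, chosen for illustration
v = [1; 1];         % candidate eigenvector
Av = A * v          % gives [3; 3], which is 3*v, so v is an eigenvector with eigenvalue 3

u = [1; 0];         % a vector that is not an eigenvector of A
Au = A * u          % gives [2; 1], which is not a multiple of u
```

The first product is simply a scaled copy of v, while the second points in a new direction, which is exactly the distinction the definition draws.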





Exercise 16.1


1. Show that v = [...] is an eigenvector of the following matrix by computing the product Av,
and find the corresponding eigenvalue by expressing this new vector as a multiple of v.


A = [...] (16.2)


2. On the same axes, plot the vectors representing v and Av. Does the plot confirm that
v is an eigenvector? Did you get the correct eigenvalue?


3. On the same axes, plot the vectors representing u = [...] and Au. Is this an eigenvector of
A?


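Eigenvalues and eigenvectors of a diagonal matrix


Recall from our earlier work that the matrix


$A = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$


scales vectors by a factor of 2 in the x-direction and by a factor of 3 in the y-direction. Thus a vector that
had a non-zero component only in the x direction will be scaled by a factor of 2 when transformed by this
matrix. In other words, $\lambda_1 = 2$ is an eigenvalue with corresponding eigenvector $v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Similarly, $\lambda_2 = 3$ is an
eigenvalue with corresponding eigenvector $v_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. Let’s check if the first one is true:


$A v_1 = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix}$.


Therefore $\lambda_1 = 2$ is an eigenvalue with corresponding eigenvector $v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$.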


Exercise 16.2
Confirm that $\lambda_2 = 3$ is an eigenvalue with corresponding eigenvector $v_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ by computing the
product $Av_2$ and expressing this new vector as a multiple of $v_2$.


Based on this example, we can heuristically guess that the eigenvalues of an n x n diagonal matrix are 
the entries on the diagonal. The n eigenvectors each have a single 1 in them, with the remaining entries 
being zero. 
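This guess is easy to test with MATLAB's eig function, which is introduced properly in the next homework. The sketch below uses the 2 x 2 matrix that scales x by 2 and y by 3; treat the printed values as what we would expect rather than a guarantee about ordering or sign.

```matlab
A = diag([2 3]);      % diagonal matrix that scales x by 2 and y by 3
[V, D] = eig(A);      % columns of V are eigenvectors, diagonal of D holds the eigenvalues
diag(D)               % expected: the diagonal entries 2 and 3
V                     % expected: columns with a single nonzero entry each
A * V - V * D         % zero matrix: every column satisfies A*v = lambda*v
```

The last line is a compact way of checking all eigenvector and eigenvalue pairs at once.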





Exercise 16.3 


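What are the eigenvalues and eigenvectors of the following diagonal matrices?


1. $A = \begin{bmatrix} -3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 4 \end{bmatrix}$


2. $A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 0 \end{bmatrix}$


Exercise 16.4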


What is one eigenvector of the following rotation matrix?


$R = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}$


Solution 16.1


1. Compute that


$Av = 2v$


and so the corresponding eigenvalue is $\lambda = 2$.


2. As we can see in the picture below, both v and Av point in the same direction, which confirms
v is an eigenvector of A. Since Av is twice as long as v that confirms that the eigenvalue is 2.


3. As we can see in the picture below, u and Au point in different directions, so u is not an
eigenvector of A.


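Solution 16.2


We compute that


$A v_2 = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = 3 v_2$.


Solution 16.3


1. The eigenvalues are $\lambda_1 = -3$, $\lambda_2 = -1$ and $\lambda_3 = 4$ and the corresponding eigenvectors are
$v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $v_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$ and $v_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$.

2. The eigenvalues are $\lambda_1 = 2$, $\lambda_2 = 4$ and $\lambda_3 = 0$ and the corresponding eigenvectors are
$v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $v_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$ and $v_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$.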


Solution 16.4


The vector $v = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$ is an eigenvector of R because it is the rotation axis and therefore remains
unchanged on rotation.
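A quick numerical check of this claim is sketched below. The angle of 30 degrees is an arbitrary assumption; any other angle should behave the same way, because the middle row and column of R do not depend on the angle.

```matlab
theta = pi/6;                         % assumed rotation angle (30 degrees)
R = [ cos(theta) 0 sin(theta);
      0          1 0;
     -sin(theta) 0 cos(theta)];       % rotation about the y-axis
v = [0; 1; 0];                        % the rotation axis
Rv = R * v                            % returns [0; 1; 0], i.e. 1*v
```

Since Rv equals v, the rotation axis is an eigenvector with eigenvalue 1.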


Chapter 17 


Homework 6: Eigenvalues and 
Eigenvectors 





Contents 
17.1 Calculating Eigenvalues and Eigenvectors of Matrices
17.2 Properties of Eigenvalues and Eigenvectors
17.3 Eigenvalues and Eigenvectors using MATLAB
17.4 Eigenvalues and Eigenvectors in Data Analysis
17.5 Diagnostic Quiz





Learning Objectives
Concepts
• Compute the eigenvalues and eigenvectors of a 2 x 2 matrix by hand
• Compute the eigenvalues and eigenvectors of an n x n matrix using MATLAB
• Describe the geometric meaning of eigenvalues and eigenvectors
• Use eigenvectors to compute and interpret directions of variation in data
MATLAB skills
• Compute the eigenvectors and eigenvalues of a given matrix
• From a given dataset, set up the relevant matrices and compute the covariance matrix of the
dataset.


What is this about? The big ideas of this assignment are eigenvectors and eigenvalues. Recall that when 
you multiply a vector by a matrix, the resulting vector usually points in a different direction. An 
eigenvector of a square matrix is a vector which does not change direction when multiplied by that 
matrix. It can only change in length. The eigenvalue corresponding to this eigenvector is the scale 
factor that is applied to that eigenvector as a result of the matrix multiplication. Therefore, the 
eigenvector of a matrix points in a special direction - it’s a direction that is not modified by the linear
transformation associated with that matrix. This is an idea that we will keep coming back to in a
number of different ways throughout QEA (including next semester). The ideas contained here can be
applied in many ways (many of which we won’t get to until next semester) such as


• Directions of greatest variation in data.
• Natural co-ordinates of systems.
• Frequency response of filters.
• Analysis of dynamical systems.

Reference Material Here are some videos and tutorials that may help you understand this material.


• Eigenvalues and Eigenvectors by 3Blue1Brown (watch first 14 mins)
• Paul’s Online Notes. Review : Eigenvalues and Eigenvectors
• Intro to eigenvectors by PatrickJMT
• Calculating eigenvalues and eigenvectors of a 2 x 2 matrix by PatrickJMT.


17.1 Calculating Eigenvalues and Eigenvectors of Matrices 


Recall from class that $\lambda$ is an eigenvalue of a matrix A with corresponding eigenvector v if $Av = \lambda v$.
Geometrically, this means that the matrix A doesn’t change the direction of v, it simply scales it by a factor
of $\lambda$.

Given a square matrix, how can we find its eigenvalues and eigenvectors? In class, we calculated these by 
hand for the special case of diagonal matrices, and now we will move to generic 2 x 2 matrices. For general 
square matrices which are larger than 2 x 2, we will use MATLAB’s eig to compute the eigenvalues and 
eigenvectors. 


Finding eigenvalues 


So far we’ve dealt with matrices for which it is possible to think your way to the eigenvalues. For general 
matrices, this is rarely the case, and we need a method that is foolproof. The method most widely adopted 
involves the determination of an algebraic equation for the eigenvalues, usually known as the characteristic 
equation. For this reason, eigenvalues are often known as characteristic values. 

Let’s start with an example. Consider the matrix


$A = \begin{bmatrix} 18 & -2 \\ 12 & 7 \end{bmatrix}$


The definition of an eigenvalue and eigenvector implies that we are seeking $\lambda$ and v which satisfy


$Av = \lambda v$.


We subtract $\lambda v$ from both sides


$Av - \lambda v = 0$


and then factor the left hand side to give


$(A - \lambda I)v = 0$


Notice that an identity matrix I has appeared out of nowhere - this simply allows us to write the vector v as
Iv so that we can factor out the matrix $A - \lambda I$. Also notice that this new matrix is just A with $\lambda$ subtracted
from the diagonal terms. For this example we have


$A - \lambda I = \begin{bmatrix} 18 - \lambda & -2 \\ 12 & 7 - \lambda \end{bmatrix}$


We are only interested in v that are nonzero, i.e., v is not the vector of all zeroes. (This is because v = 0
is always a solution to $Av = \lambda v$ for any A and any $\lambda$, so it’s not very interesting or informative.) Assuming
v is nonzero implies that the matrix $(A - \lambda I)$ is not invertible. Why? If $(A - \lambda I)$ were invertible, then we
could rearrange the equation to get


$v = (A - \lambda I)^{-1} 0 = 0$


which contradicts our assumption that $v \neq 0$. Therefore, $(A - \lambda I)$ is not invertible.
Since $(A - \lambda I)$ is not invertible, it must have determinant zero. In other words,


$\det(A - \lambda I) = 0$.


In our example, this implies that


$\det(A - \lambda I) = (18 - \lambda)(7 - \lambda) + 24 = 0$.


This is called the characteristic equation:


$(18 - \lambda)(7 - \lambda) + 24 = 0$


or, rearranged,


$\lambda^2 - 25\lambda + 150 = 0$.


The characteristic equation is a polynomial equation in the variable $\lambda$ that arises by setting the determinant of
$(A - \lambda I)$ equal to zero. The solutions to this equation give the eigenvalues. In our example, the
polynomial can be factored


$(\lambda - 15)(\lambda - 10) = 0$


so that gives eigenvalues $\lambda_1 = 10$ and $\lambda_2 = 15$. (We could use the quadratic formula if necessary.)

Let’s retrace our steps: If $\lambda$ is either 10 or 15, then the determinant of $(A - \lambda I)$ is zero. This implies
that $(A - \lambda I)$ is not invertible, so we can look for nonzero solutions v to $(A - \lambda I)v = 0$ and those v are
eigenvectors associated to the eigenvalue $\lambda$.

In summary, here’s the general procedure for finding the eigenvalues of a matrix: 


1. Rearrange $Av = \lambda v$ to get $(A - \lambda I)v = 0$.
2. Compute the determinant of $(A - \lambda I)$.


3. Since the matrix is not invertible, we set that determinant equal to zero: $\det(A - \lambda I) = 0$. This gives
a polynomial in $\lambda$, known as the characteristic equation.


4. Solve the polynomial for the roots $\lambda$. Those are the eigenvalues.
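The procedure above can be mirrored numerically for the running example. This is only a sketch: poly and roots are standard MATLAB functions, but they work in floating point, so the printed values should be read as approximately the integers found by hand.

```matlab
A = [18 -2; 12 7];     % the example matrix, consistent with (18 - lambda)(7 - lambda) + 24
p = poly(A)            % characteristic polynomial coefficients, approximately [1 -25 150]
lambda = roots(p)      % roots of lambda^2 - 25*lambda + 150 = 0, approximately 15 and 10
```

The coefficient vector p corresponds to the rearranged characteristic equation, and its roots are the eigenvalues.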


Exercise 17.1 


1. You already know that the eigenvalues of a diagonal matrix are just the entries on the diagonal.
Using the above procedure, confirm that


$A = \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix}$


has eigenvalues $\lambda_1 = 2$ and $\lambda_2 = -3$.


2. Notice that one of the eigenvalues is positive and one is negative. The eigenvector associated
with $\lambda_1 = 2$ is $v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and the eigenvector associated with $\lambda_2 = -3$ is $v_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. Plot
$v_1$, $v_2$, $Av_1$ and $Av_2$. What effect does the negative sign in the eigenvalue have? In other
words, what is the difference between a negative and positive eigenvalue?





It’s worth noting that eigenvalues come in more flavors than positive or negative. They can also be
complex numbers. For now, we will focus on matrices with real eigenvalues, but if you’re curious about the
complex case, you can learn about it in this worksheet (ignore the first page).


Finding Eigenvectors


In the example in the previous section, we discovered that the eigenvalues of


$A = \begin{bmatrix} 18 & -2 \\ 12 & 7 \end{bmatrix}$


are $\lambda_1 = 10$ and $\lambda_2 = 15$. How do we find the corresponding eigenvectors $v_1$ and $v_2$?

First, let’s find the eigenvector corresponding to $\lambda_1 = 10$. Remember that we knew $\lambda_1$ was an eigenvalue
because it solved the characteristic equation, i.e., $\det(A - \lambda_1 I) = 0$. This is important because it implies
$(A - \lambda_1 I)$ is non-invertible, and therefore, there exists a nonzero vector $v_1$ such that $(A - \lambda_1 I)v_1 = 0$. But
it’s not enough just to know that such a vector exists, we want to know exactly what it is.

In our running example, this means we are looking for $v_1$ such that


$\begin{bmatrix} 18 - 10 & -2 \\ 12 & 7 - 10 \end{bmatrix} v_1 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$


Let’s write $v_1$ in terms of its unknown components


$v_1 = \begin{bmatrix} a \\ b \end{bmatrix}$


to get the matrix equation


$\begin{bmatrix} 8 & -2 \\ 12 & -3 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$


This gives us two equations


8a - 2b = 0 and 12a - 3b = 0.


But notice that these equations provide the same information: they both imply that b = 4a. This is because
$(A - \lambda_1 I)$ is not invertible, so the rows are linearly dependent. The system of linear equations implied by
$(A - \lambda_1 I)v_1 = 0$ has infinitely many solutions of the form $\begin{bmatrix} a \\ 4a \end{bmatrix}$ for any a. Letting a = 1, we get $v_1 = \begin{bmatrix} 1 \\ 4 \end{bmatrix}$.
If we let a = 5, we would have the eigenvector $\begin{bmatrix} 5 \\ 20 \end{bmatrix}$. This hints at an important fact about eigenvectors:
we only care about an eigenvector’s direction, not its length. So we could have chosen $v_1$ to be any vector
pointing in the same direction as $\begin{bmatrix} 1 \\ 4 \end{bmatrix}$ (such as $\begin{bmatrix} 5 \\ 20 \end{bmatrix}$ or $\begin{bmatrix} 2 \\ 8 \end{bmatrix}$). We often speak about “the” eigenvector
corresponding to an eigenvalue, but only the direction of the eigenvector is unique, not the length.
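The same eigenvector can be found numerically by asking MATLAB for the null space of the non-invertible matrix. This is a sketch rather than the method used in the text: null returns a unit-length vector whose sign is not guaranteed, so we rescale it to make the first component 1 before comparing with the hand calculation.

```matlab
A = [18 -2; 12 7];
lambda1 = 10;
M = A - lambda1 * eye(2);    % the non-invertible matrix [8 -2; 12 -3]
v = null(M);                 % unit-length vector spanning the solutions of M*v = 0
v = v / v(1)                 % rescale so the first component is 1, giving [1; 4]
A * v - lambda1 * v          % numerically zero, confirming A*v = 10*v
```

Rescaling is allowed precisely because only the direction of an eigenvector matters.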


Exercise 17.2 


We can always check that $\lambda_1$ and $v_1$ are the corresponding eigenvalue and eigenvector for the matrix
A by plugging them into the equation $Av_1 = \lambda_1 v_1$ and verifying that it holds.


1. Use this procedure to check that $\lambda_1 = 10$ and $v_1 = \begin{bmatrix} 1 \\ 4 \end{bmatrix}$ are the corresponding eigenvalue and
eigenvector for $A = \begin{bmatrix} 18 & -2 \\ 12 & 7 \end{bmatrix}$.


Exercise 17.3
Using the basic eigenvalue/eigenvector equation
$Av = \lambda v$


show that if v is an eigenvector for A, then cv is also an eigenvector for A, where c is any nonzero constant.


Exercise 17.4
In our ongoing example we chose a = 1 so that the eigenvector is $v_1 = \begin{bmatrix} 1 \\ 4 \end{bmatrix}$. But we now know that
we could scale this vector to be any length. A very common standard is to normalize eigenvectors so
that they are unit length, i.e., their length should be 1.


1. Normalize the eigenvector $v_1 = \begin{bmatrix} 1 \\ 4 \end{bmatrix}$ so that it has unit length.


Exercise 17.5


Continuing the example above, with


$A = \begin{bmatrix} 18 & -2 \\ 12 & 7 \end{bmatrix}$


find the eigenvector that corresponds to the eigenvalue $\lambda_2 = 15$, and then normalize it so that it has
unit length.


Exercise 17.6 


Determine the eigenvalues and eigenvectors of the following 2 x 2 matrices. Normalize the eigenvectors.


[...]





Exercise 17.7
We have two vectors,


n = [...], z = [...] (17.2)


In other words, the vectors n and z point in a very similar direction, but are not perfectly aligned.
Now consider a matrix S given by


S = [...]


1. On the same axes, plot the vectors n and z using MATLAB.


2. Suppose that n and z are transformed by S. On the same axes as in the previous part, plot the
vectors Sn and Sz using MATLAB.


3. Now, we shall see what happens to these vectors under repeated transformations by S. On the
same axes as in the previous part, plot the vectors SSn and SSz using MATLAB.


4. On the same axes as in the previous part, plot the vectors SSSn and SSSz using MATLAB.


5. On the same axes as in the previous part, plot the vectors SSSSn and SSSSz using MATLAB.


6. You should find that n is unaffected by the transformation by S, but z on the other hand moves
farther and farther away. In other words, under repeated transformations by S, z grew further
and further apart from his four friends. Explain what you see in terms of eigenvalues and
eigenvectors.





17.2 Properties of Eigenvalues and Eigenvectors 


Consider an n x n matrix A. The characteristic polynomial will be a polynomial of degree n in $\lambda$, i.e., it will
have the form


$c_n \lambda^n + \cdots + c_1 \lambda + c_0 = 0$


where $c_i$ are constants. This polynomial will have n roots, although some of those roots might be the same
(e.g., both roots of the polynomial $\lambda^2 + 2\lambda + 1 = 0$ are -1, so we say $\lambda_1 = -1$ and $\lambda_2 = -1$.) Since the
eigenvalues are the roots of a polynomial then it is possible that some of them will be complex, and their
corresponding eigenvectors would be complex too.

The following are key properties of the eigenvalues and eigenvectors (some of these are n-dimensional
extensions of what you already saw for 2 dimensions).


• An n x n matrix has n eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, where it is possible that some eigenvalues are
equal or complex.


• If the eigenvalues are distinct (none are equal) then the corresponding eigenvectors are linearly
independent.

• If a matrix is symmetric, i.e., $A = A^T$, then its eigenvalues are real and its eigenvectors can be chosen to be orthogonal
to each other.
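The property for symmetric matrices is easy to see on a small case. The matrix S below is a hypothetical example of ours, chosen only because it is symmetric; the sketch checks that its eigenvalues are real and that its two eigenvectors have a dot product of (numerically) zero.

```matlab
S = [2 1; 1 3];                         % hypothetical symmetric matrix, S equals transpose(S)
[V, D] = eig(S);                        % eigenvectors in the columns of V
isreal(diag(D))                         % true: the eigenvalues are real
dotprod = transpose(V(:,1)) * V(:,2)    % approximately 0: the eigenvectors are orthogonal
```

A dot product of zero is exactly what it means for the two eigenvectors to be orthogonal.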


17.3 Eigenvalues and Eigenvectors using MATLAB


While most of our work on eigenvalues and eigenvectors has focused on 2D vectors and 2 x 2 matrices,
these ideas extend to higher dimensions as well. The eigenvalues and eigenvectors can be found by solving
the characteristic polynomial, which quickly becomes unwieldy, or impossible to do by hand, as the size of A increases. We
can instead use the MATLAB eig function.

A few words about eig are in order. The following command 


>> [V,D] = eig(A) 


will return two matrices. The columns of the matrix V are the eigenvectors. D is a diagonal matrix, with 
the eigenvalues on the diagonal. The first eigenvector is in the first column of V and has a corresponding 
eigenvalue in the first diagonal entry of D. Each eigenvector is normalized to have a length or magnitude of 
1. The eigenvalues will often "appear" to be sorted according to their size, but this is not necessarily true, 
and is simply an artifact of the algorithm used to compute them. See the documentation in MATLAB for 
more details. 
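A short sketch of how the two outputs fit together, using the 2 x 2 matrix with eigenvalues 10 and 15 from earlier in this assignment. Since the order of the eigenvalues is not guaranteed, the comments avoid assuming one.

```matlab
A = [18 -2; 12 7];
[V, D] = eig(A);
lambda = diag(D)                  % the eigenvalues 10 and 15, in whatever order eig returns them
residual = norm(A * V - V * D)    % close to zero: each column of V satisfies A*v = lambda*v
lengths = sqrt(sum(V.^2, 1))      % each column of V has length 1
```

The identity AV = VD is just the equation Av = λv written for all eigenvectors at once, one column per eigenvector.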


Exercise 17.8 


Use MATLAB’s eig function to get the eigenvalues and eigenvectors of the following matrices, and 
compare to your results from earlier exercises. 


1. $A = \begin{bmatrix} 18 & -2 \\ 12 & 7 \end{bmatrix}$


2. A = [...]





17.4 Eigenvalues and Eigenvectors in Data Analysis 


In the last class you worked on examples involving correlation matrices. Here we will look at covariance 
matrices, which are related to correlation matrices, except that the entries are not normalized by the standard 
deviations of the variables. You can think of covariance matrices as measuring the relationship between 
random quantities, but without normalization. Thus, information about how small or large these data values 
are will still be preserved in the covariance matrix. 

Suppose that we have two different data variables x and y (e.g. corresponding to temperatures in Boston
and Sao Paolo), with $x_i$ and $y_i$ being different values in the data set. We can define a matrix A as follows:


$A = \frac{1}{\sqrt{N-1}} \begin{bmatrix} x_1 - \mu_x & y_1 - \mu_y \\ x_2 - \mu_x & y_2 - \mu_y \\ x_3 - \mu_x & y_3 - \mu_y \\ \vdots & \vdots \end{bmatrix}$


where $\mu_x$ is the mean of the first column, $\mu_y$ is the mean of the second column, and N is the number of samples (rows). The covariance matrix of
x and y is $R = A^T A$. You can think of the entries of this matrix as storing the un-normalized correlations
between the temperatures. Because $R^T = R$, this matrix is symmetric, and hence its eigenvalues are real
and it has orthogonal eigenvectors.

The eigenvectors and eigenvalues of R tell us something about how the data are distributed. The 
eigenvector corresponding to the largest eigenvalue of R, which is also called the principal eigenvector 
of R points in the direction with the largest variation in the data. The eigenvector corresponding to the 
second largest eigenvalue points in the direction orthogonal to the principal eigenvector in which there is 
the second largest amount of variation in the data, and so on (if you have more than 2 dimensional data). The
square-root of the eigenvalues tells you about the amount of variation there is in each of those directions. 
Of course when you only have two different variables in the data set, the matrix R has only 2 orthogonal 
eigenvectors. 
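To see these ideas in action without the temperature data, here is a sketch on made-up two-variable data. Everything about the data is an assumption (500 random samples in which y is roughly 0.8 times x plus noise); the steps of centering, scaling by the square root of N - 1, and taking eigenvectors follow the description above.

```matlab
rng(1);                                 % repeatable made-up data
N = 500;
x = randn(N, 1);
y = 0.8 * x + 0.3 * randn(N, 1);        % y is strongly related to x (assumed relationship)
B = [x y];

A = (B - mean(B)) ./ sqrt(N - 1);       % centered and scaled data matrix
R = transpose(A) * A;                   % covariance matrix, matches MATLAB's cov(B)

[V, D] = eig(R);
[~, k] = max(diag(D));                  % index of the largest eigenvalue
principal = V(:, k)                     % direction of greatest variation in the data
spread = sqrt(diag(D))                  % amount of variation along each eigenvector
```

Because y rises with x in this made-up data, the two components of the principal eigenvector share the same sign, i.e. it points along the diagonal trend of the scatter.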

To illustrate, consider Figure 17.1 which shows the centered (mean subtracted) temperatures of Boston 
vs Sao Paolo. We have also plotted the two eigenvectors, scaled by the square-root of their corresponding 
eigenvalues, to illustrate the relative variation of the data along the directions of the two eigenvectors. Notice 
that the principal eigenvector is in the direction of greatest variation in the data. Figure 17.2 is a similar plot
with the temperatures of Boston and Washington DC instead.

Figure 17.1: Centered average daily temperatures of Boston vs Sao Paolo, with the eigenvectors of the
covariance matrix.

Figure 17.2: Centered average daily temperatures of Boston vs Washington DC, with the eigenvectors of the
covariance matrix.


Exercise 17.9 


In this next problem, we are going to visualize how the eigenvectors of covariance matrices can tell 
us about the directions of most variation in 3D data. Load the file temps_bos_sp_dc.mat in 
MATLAB (this file can also be downloaded from the Canvas page for Homework 6). This file will load 
21 years of temperature values for Boston, Sao Paolo and Washington DC. Treat the temperatures of 
Boston, Sao Paolo, and Washington DC for a given day as a point in a 3D space. 


1. Subtract out the mean temperature of each city from the daily temperature data.


2. Make a 3D scatter plot of the data points with the means subtracted out. You will find
MATLAB’s plot3 function useful. You may wish to use the 'MarkerSize' argument
for plot3 with a marker size of 0.1 or less to make the plots clearer.


3. Construct a covariance matrix for the data and compute its eigenvectors.


4. On the same axes, using quiver3 or plot3, plot the eigenvectors scaled by the square root
of their corresponding eigenvalues. Use grid on to draw grid lines on the axes to improve
your visualization.


5. Using the rotate 3D button on the figure window, rotate the image around to see how the
eigenvectors tell you about the variation in the data.


17.5 Diagnostic Quiz 


Please see Canvas for the quiz questions. 
Solution 17.1


1. First we form the matrix $A - \lambda I$, and then we compute its determinant

\[ \det(A - \lambda I) = (2 - \lambda)(-3 - \lambda). \]

Setting this equal to zero produces the characteristic equation,

\[ (2 - \lambda)(-3 - \lambda) = 0, \]

whose roots are, in fact, $\lambda_1 = 2$ and $\lambda_2 = -3$.


2. Here’s a plot of $v_1$, $v_2$, $Av_1$ and $Av_2$:


[Figure: the vectors $v_1$, $v_2$, $Av_1$ and $Av_2$ drawn in the plane.]


When the eigenvalue is negative, the eigenvector is reversed in direction and then scaled.


Solution 17.2 


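1. First we compute the left-hand side

\[ A v_1 = \begin{bmatrix} 18 & -2 \\ 12 & 7 \end{bmatrix} \begin{bmatrix} 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 10 \\ 40 \end{bmatrix} \]

and the right-hand side

\[ \lambda_1 v_1 = 10 \begin{bmatrix} 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 10 \\ 40 \end{bmatrix}. \]

Fortunately, they are equal.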


Solution 17.3 
Using the fact that $Av = \lambda v$, we see that

\[ A(cv) = cAv = c\lambda v = \lambda(cv) \]

and therefore $cv$ is also an eigenvector.


Solution 17.4 


1. We normalize $v_1$ by first finding its length and then constructing a unit vector. The length of
$v_1$ is given by

\[ \|v_1\| = \sqrt{1^2 + 4^2} = \sqrt{17}. \]

Now we construct the unit vector $\hat{v}_1$ by dividing $v_1$ by its length

\[ \hat{v}_1 = \frac{v_1}{\|v_1\|} = \begin{bmatrix} 1/\sqrt{17} \\ 4/\sqrt{17} \end{bmatrix}. \]

We often use the ^ symbol to denote a vector that has unit length! Also, notice that even this
eigenvector is not unique because we could multiply it by $-1$ to get the vector

\[ \begin{bmatrix} -1/\sqrt{17} \\ -4/\sqrt{17} \end{bmatrix}; \]

all we’ve done is flip the vector by 180 degrees.


Solution 17.5 


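First we compute

\[ A - \lambda_2 I = \begin{bmatrix} 18 - 15 & -2 \\ 12 & 7 - 15 \end{bmatrix} = \begin{bmatrix} 3 & -2 \\ 12 & -8 \end{bmatrix}. \]

Now, letting $v_2 = \begin{bmatrix} a \\ b \end{bmatrix}$, we are trying to solve

\[ \begin{bmatrix} 3 & -2 \\ 12 & -8 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \]

which produces the equations

\[ 3a - 2b = 0 \quad \text{and} \quad 12a - 8b = 0. \]

(These equations give the same information since the rows of $(A - \lambda_2 I)$ are linearly dependent.)

This gives $b = \frac{3}{2}a$, so $v_2 = \begin{bmatrix} a \\ \frac{3}{2}a \end{bmatrix}$ for any value of $a$. Picking $a = 2$, we have $v_2 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$. To normalize
it we need its length, which is

\[ \|v_2\| = \sqrt{2^2 + 3^2} = \sqrt{13}, \]

so that the unit eigenvector is

\[ \hat{v}_2 = \begin{bmatrix} 2/\sqrt{13} \\ 3/\sqrt{13} \end{bmatrix}. \]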


Solution 17.6 





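1. $\lambda_1 = 5$, $\lambda_2 = 2$ and

\[ v_1 = \begin{bmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}, \quad v_2 = \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \]

2. $\lambda_1 = -1$, $\lambda_2 = 4$ and

\[ v_1 = \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}, \quad v_2 = \begin{bmatrix} 2/\sqrt{13} \\ 3/\sqrt{13} \end{bmatrix} \]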


Solution 17.7 


See Figure 17.1. n is an eigenvector of S with an eigenvalue of 1, so it is unchanged by the transfor- 
mation S. However, z is not an eigenvector of S, so it changes each time the transformation S is 
applied, and the change accelerates as it diverges from the eigenvector n. 

[Figure: the points n, Sn, SSn, SSSn, SSSSn (all coincident) and z, Sz, SSz, SSSz, SSSSz, plotted with Coordinate 1 on the horizontal axis and Coordinate 2 on the vertical axis.]


Solution 17.8 
1. >> A = [18 -2; 12 7]

A =
    18    -2
    12     7

>> [V,D] = eig(A)

V =
    0.5547    0.2425
    0.8321    0.9701

D =
    15     0
     0    10

You will notice that the eigenvectors are numerical approximations to the exact ones found
earlier.


2. >> A = [4 2; 1 3]

A =
     4     2
     1     3

>> [V,D] = eig(A)

V =
    0.8944   -0.7071
    0.4472    0.7071

D =
     5     0
     0     2

Great!


3. >> A = [1 2; 3 2]

A =
     1     2
     3     2

>> [V,D] = eig(A)

V =
   -0.7071   -0.5547
    0.7071   -0.8321

D =
    -1     0
     0     4

You should notice that the eigenvector corresponding to the eigenvalue of 4 has two negative
entries instead of two positive ones. This is absolutely fine: it has been "flipped" by 180 degrees.
You should not expect the result from MATLAB to match all of your signs, but within each
eigenvector the relative signs of the entries should match!
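
Since the order and the signs returned by eig can differ from a hand calculation, it can help to normalize the output before comparing. The sketch below is one possible convention, not something MATLAB does for you: it sorts the eigenvalues from largest to smallest and flips each eigenvector so that its largest-magnitude entry is positive. The sign rule is an assumption chosen for illustration.

```matlab
A = [1 2; 3 2];
[V, D] = eig(A);

% Sort eigenvalues (and the matching eigenvector columns) from largest to smallest
[lambda, idx] = sort(diag(D), 'descend');
V = V(:, idx);

% Optional sign convention (an assumption, not a MATLAB default):
% flip each eigenvector so that its largest-magnitude entry is positive
for k = 1:size(V, 2)
    [~, imax] = max(abs(V(:, k)));
    if V(imax, k) < 0
        V(:, k) = -V(:, k);
    end
end

disp(lambda')
disp(V)
```

Flipping a column of V does not change the fact that it is an eigenvector, which is the point made in Solution 17.3.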


Solution 17.9 

1. bn = b - mean(b); sn = s - mean(s); wn = w - mean(w);

2. plot3(bn, sn, wn, '.', 'MarkerSize', 0.1)
xlabel('Boston temperature (mean subtracted)')
ylabel('Sao Paolo temperature (mean subtracted)')
zlabel('Wash. D.C. temperature (mean subtracted)')

3. A = 1/sqrt(length(b)-1) * [bn, sn, wn];   % assumes b, s, w are column vectors
R = transpose(A) * A;
[V, D] = eig(R)

4. plot3(bn, sn, wn, '.', 'MarkerSize', 0.1)
Vs = V * sqrt(D);   % scale each eigenvector (column) by the square root of its eigenvalue
hold on
plot3([0, Vs(1,1)], [0, Vs(2,1)], [0, Vs(3,1)], 'LineWidth', 2)
plot3([0, Vs(1,2)], [0, Vs(2,2)], [0, Vs(3,2)], 'LineWidth', 2)
plot3([0, Vs(1,3)], [0, Vs(2,3)], [0, Vs(3,3)], 'LineWidth', 2)
grid on
axis equal


See Figure 17.3, which has the principal eigenvector clearly aligned with the direction of greatest
variation.

Figure 17.3: Temperatures and eigenvectors.


Chapter 18 


Week 7a: Eigenvalue Decomposition 
(EVD) 





Schedule 
18.1 Debrief [15 mins]
18.2 Eigenvalue Decomposition (EVD) [45 mins]
18.3 Introduction to Principal Components Analysis [15 mins]





18.1 Debrief [15 mins] 


In the last class and in the homework exercises, you worked on a number of different exercises involving 
eigenvalues and eigenvectors. Try to resolve your confusions with the folks in your breakout room and by 
talking to an instructor. 


18.2 Eigenvalue Decomposition (EVD) [45 mins] 


The eigenvalue decomposition, also known as the eigendecomposition, is an operation on matrices in which 
a square matrix is expressed as a product of matrices made up of its eigenvalues and eigenvectors. It can be 
used to find inverses and powers of matrices, as well as to derive some important results in data analysis. 
For instance, in a prior exercise, you saw that the eigenvector corresponding to the largest eigenvalue of a 
covariance matrix was in the direction of greatest variance in your data set. This property can be proved 
using the eigendecomposition. 

The eigenvalue decomposition is also helpful in dimensionality reduction, which is a process where
we represent higher-dimensional vectors as a linear combination of a smaller number of vectors than
dimensions. You saw an example of this in a previous exercise, where you represented pictures of people’s
faces using a linear combination of vectors. The eigendecomposition is also often used to change coordinate
systems.


The Big Idea 


Assume that a square $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors $v_i$ with corresponding
eigenvalues $\lambda_i$, i.e.

\[ A v_i = \lambda_i v_i, \quad i = 1, \ldots, n. \]

Instead of thinking of these eigenvalues and eigenvectors separately, let’s package them into matrices as
follows:

\[ \begin{bmatrix} A v_1 & A v_2 & \cdots & A v_n \end{bmatrix} = \begin{bmatrix} \lambda_1 v_1 & \lambda_2 v_2 & \cdots & \lambda_n v_n \end{bmatrix} \]

Properties of matrix multiplication suggest that we can rewrite this matrix equation in the form

\[ A \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & \lambda_n \end{bmatrix} \]

where the last matrix has each eigenvalue on the diagonal. If we now define

\[ V = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}, \qquad D = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & \lambda_n \end{bmatrix} \]

then the previous equation becomes

\[ AV = VD. \]

Since we assumed that the eigenvectors are linearly independent, the columns of $V$ are
linearly independent, which in turn implies that the inverse of $V$ exists. We can therefore write

\[ A = V D V^{-1} \tag{18.1} \]

where the matrix $V$ has the $i$-th eigenvector of $A$ as its $i$-th column, and $D$ is a diagonal matrix with the
$i$-th eigenvalue of $A$ as its $i$-th diagonal entry. This expression is known as the eigendecomposition of $A$.

This expression holds for any square matrix that has $n$ linearly independent eigenvectors (such a matrix
is called diagonalizable); $A$ itself does not need to be invertible. An eigendecomposition tells you that the
original matrix is ‘composed’ of an eigenbasis (the columns of $V$) with associated eigenvalues in $D$. There
may come a time when it is more convenient to work in $A$’s eigenbasis and then transform the result back.

In the special case where $A$ is symmetric, the eigenvalues are real, and the eigenvectors can be chosen to be
mutually orthogonal unit vectors, so that

\[ V^{-1} = V^T \]

which is a property of $n \times n$ matrices whose column vectors are mutually orthogonal and have a length of 1
(i.e., the column vectors are orthonormal).
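
To make Equation (18.1) concrete, here is a minimal MATLAB sketch that checks it numerically. The nonsymmetric matrix is the one from Solution 17.8; the symmetric matrix B is a hypothetical example chosen only for illustration.

```matlab
% Nonsymmetric example: A = V*D*inv(V)
A = [4 2; 1 3];
[V, D] = eig(A);
err_general = norm(A - V * D / V);   % V*D/V computes V*D*inv(V)

% Symmetric example (hypothetical matrix): inv(V) equals V'
B = [5 2; 2 1];
[Vb, Db] = eig(B);
err_orth = norm(Vb' * Vb - eye(2));  % columns of Vb are orthonormal
err_sym  = norm(B - Vb * Db * Vb');  % B = V*D*V'

disp([err_general, err_orth, err_sym])   % all should be close to zero
```

Using the slash operator instead of inv(V) is a common MATLAB idiom for applying an inverse on the right.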


Exercise 18.1 


Read through "The Big Idea" about the eigenvalue decomposition, being careful to fill in the gaps. 
What are you confused about? 


Exercise 18.2 


Consider the following $2 \times 2$ matrix $A$.

\[ A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \]

By hand, compute its eigenvectors and eigenvalues, determine the matrices $V$, $D$, and $V^{-1}$, and
confirm that (18.1) is correct. Use MATLAB to confirm your results by computing >> [V,D] = eig(A).
Note: you should normalize each of your eigenvectors to be unit length.


Exercise 18.3


The eigendecomposition can be used to change basis as follows. Consider the matrix A from the 
previous exercise as a transformation matrix. 


1. How does the matrix A transform the vector w? Draw both w and Aw on an
xy-coordinate plane.

2. Draw both eigenvectors of A on this coordinate plane.

3. Decompose the vector w as a linear combination of both eigenvectors. You should be able to
do this with a matrix-vector multiply. You are expressing the vector in a new basis.

4. Scale each component by the relevant eigenvalue.

5. Undo the decomposition to return to the original basis.

6. What just happened?


Exercise 18.4 


One thing that the eigendecomposition helps us compute is how to raise A to an integer power,
without going through the process of repeated multiplication.


1. Using the eigendecomposition, show that the following is true

\[ A^2 = V D^2 V^{-1} \tag{18.2} \]

and confirm this result using the matrix from the earlier exercise. Note that for any diagonal
matrix $D$, $D^k$ is another diagonal matrix whose $i$-th diagonal entry equals the $i$-th diagonal entry of $D$ raised
to the $k$-th power. Hence computing $D^k$ is not computationally difficult: you just raise each
diagonal entry to the $k$-th power.


2. Show that the following is also true

\[ A^n = V D^n V^{-1} \]


18.3 Introduction to Principal Components Analysis [15 mins]





In a previous assignment you explored, in a graphical manner, the relationship between the eigenvectors of 
the covariance matrix and the distribution of the data. For instance, you looked at the daily temperature 
values in Boston versus Sao Paolo and the daily temperatures in Boston versus Washington D.C. 

Figure 18.1: Centered average daily temperatures of Boston vs Sao Paolo (left) and Boston vs Washington
DC (right), with the eigenvectors of the covariance matrix.


From visually inspecting these figures we saw that eigenvector 1, which corresponded to the larger of
the two eigenvalues, seemed to be pointing in the direction where the data exhibited the most variability
(i.e., the data was most spread out along this direction). You also looked at this for a 3D dataset consisting of
the temperatures from Boston, Sao Paolo, and Washington DC.

Figure 18.2: Temperatures and eigenvectors for Boston, Sao Paolo, and Washington DC


In this 3D dataset, we see the same phenomenon: that the principal eigenvector points along the direction 
of maximum variation in the data. It turns out that this phenomenon will hold no matter the dimensionality 
of the data (it works for 4D datasets, 10D datasets, and even datasets with 1,000s of dimensions)! This fact 
provides the basis for principal components analysis (PCA). In PCA, instead of working with the data in its 
original form, we express it in a basis given by eigenvectors of the covariance matrix that have the largest 
eigenvalues. Two key properties explain why this basis is useful.


• Property 1: the principal eigenvectors of the covariance matrix will maximize the variance of the data
when the data is projected onto these vectors (we can think of vectors that capture large variation in
the data as representing important properties of the data).


• Property 2: the principal eigenvectors of the covariance matrix will allow us, in a particular sense, to
optimally compress our data. That is, we will be able to recover the original data with the highest
possible accuracy from the projections of the data onto the principal eigenvectors.


The power of PCA lies in its ability to achieve both of the properties described above simultaneously.
For this reason, the principal components of a dataset will act as keys to unlocking the secrets lurking in the
data!


Exercise 18.5 


Discuss why it makes sense to project data onto directions in which there is most variation in the
data for both data compression and face recognition.


Solution 18.2


The eigenvalues are $\lambda_1 = 1$ and $\lambda_2 = 3$ with corresponding eigenvectors

\[ v_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \end{bmatrix} \quad \text{and} \quad v_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]

This gives

\[ V = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 & 1 \\ 1 & 1 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}, \quad V^{-1} = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 & 1 \\ 1 & 1 \end{bmatrix} \]

and if you multiply them all together you will get the original matrix A. Running "eig" in MATLAB
gives the same eigenvalues and eigenvectors, although every eigenvector could be multiplied by $-1$.
MATLAB may also place your eigenvalues and eigenvectors in a different order.


Solution 18.3 


1. The vector becomes A 


2. The eigenvectors were $v_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \end{bmatrix}$ and $v_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.


3. Decomposing the vector w as a linear combination of the eigenvectors is equivalent to solving

\[ V c = w \]

for the vector c. These are the coordinates of the vector w in the new basis. You should find that


—0.5 
re 


4. We multiply the first component by 1 and the second component by 3 to give 2 ae ; 


5. In order to undo the change of basis we hit this vector with V which gives 2 as expected. 


4 


6. The eigendecomposition can be thought of as a change of basis, followed by a scaling matrix,
followed by the change back to the original basis.
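
The three stages described in part 6 can be traced numerically. The following is a minimal sketch under stated assumptions: the matrix A below is taken to be a symmetric matrix with eigenvalues 1 and 3, and the vector w is a hypothetical vector chosen purely for illustration (it is not claimed to be the vector from the exercise).

```matlab
A = [2 1; 1 2];        % assumed matrix with eigenvalues 1 and 3
w = [1; 2];            % hypothetical vector, chosen only for illustration

[V, D] = eig(A);

c      = V \ w;        % coordinates of w in the eigenbasis (solves V*c = w)
scaled = D * c;        % scale each coordinate by its eigenvalue
result = V * scaled;   % return to the original basis

direct = A * w;        % what A does to w directly
disp([result, direct]) % the two columns should agree
```

Whatever order or signs eig returns, the comparison still works, because c, D and V are all built from the same call.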


Solution 18.4 


1. Since $A = V D V^{-1}$, we know that

\[ A^2 = V D V^{-1} V D V^{-1} = V D^2 V^{-1}. \]


2. Similar reasoning to the previous problem shows that

\[ A^n = V D^n V^{-1}. \]
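
A quick numerical check of this result might look like the sketch below. The matrix is the one from Solution 17.8 and the power k = 5 is an arbitrary choice for illustration.

```matlab
A = [4 2; 1 3];
k = 5;                          % arbitrary integer power for illustration

[V, D] = eig(A);
Dk = diag(diag(D).^k);          % raise each diagonal entry to the k-th power
Ak_evd = V * Dk / V;            % V * D^k * inv(V)
Ak_direct = A^k;                % repeated multiplication, for comparison

rel_err = norm(Ak_evd - Ak_direct) / norm(Ak_direct);
disp(rel_err)                   % should be close to zero
```

The only matrix products needed are the two involving V, no matter how large k is.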


Solution 18.5 


In data compression, we project data onto orthonormal vectors which point in directions with the 
most variation in the data. The coefficients from the projections tell us exactly how much of each 
component (i.e. the eigenvectors) makes up a given data point. Therefore we focus on the directions 
with most variation, because the remaining directions don’t tell us much. 

In face recognition, we are trying to distinguish between different images. So it makes sense to focus 
on directions in which there is more variation between the images. 


Week 7b: Principal Components
Analysis


Schedule
19.1 Principal Eigenvectors as Directions with the Largest Variance [30 mins]
19.2 Applications of PCA (thinking it through conceptually) [45 mins]


19.1 Principal Eigenvectors as Directions with the Largest Variance [30 mins]


PCA at a high level 


Recall the following properties related to PCA that we saw at the end of last class: 


• In many applications, being able to reduce dimensionality of data is extremely helpful as it can lead to
efficient representations and reduced computational complexity.


• Given some data, PCA enables us to express multidimensional data as a linear combination of
orthonormal vectors, starting with the vector in the direction with most variation in the data. The next
vector will be in the direction with most variation of all directions orthogonal to the first, and so on.


• So, if we want to work with a lower-dimensional representation of our data, we can focus on those
directions that contain the most variation.


• The eigenvectors of the covariance matrix of the data are the principal component vectors. The
eigenvector corresponding to the largest eigenvalue lies in the direction with most variation in the
data set. The eigenvector corresponding to the second largest eigenvalue lies in the direction with the
next most variation, of all directions orthogonal to the first eigenvector, etc.


The Principal Eigenvector as the Direction of Maximum Variance 


The graphs of the daily temperature data show, graphically, that the principal eigenvector of the covariance 
matrix corresponds to the direction of maximum variation in the data. In this section we'll be formalizing 
this result. We’ve decided to structure this part of the day assignment as an extended exercise where you 
will be working through the proof of this fact step-by-step. While there are many ways to do this proof, 
we'll be walking you through one way that will connect well with the ideas we’ve been exploring in the last 
week or so of the course. We recommend that you do a part of the proof, check it against the solutions and 
then move onto the next piece. 

Before getting started, let’s look at some material from night 6 that shows that the covariance matrix can
be computed using matrix multiplication.


Suppose that we have two different data variables $x$ and $y$ (e.g. corresponding to temperatures
in Boston and Sao Paolo), with $x_i$ and $y_i$ being the different values in the data set. We can define a
matrix $A$ as follows:

\[ A = \frac{1}{\sqrt{N-1}} \begin{bmatrix} x_1 - \mu_x & y_1 - \mu_y \\ x_2 - \mu_x & y_2 - \mu_y \\ x_3 - \mu_x & y_3 - \mu_y \\ \vdots & \vdots \\ x_N - \mu_x & y_N - \mu_y \end{bmatrix} \tag{19.1} \]

where $\mu_x$ is the mean of the $x$ values, $\mu_y$ is the mean of the $y$ values, and $N$ is the number of samples (rows). The covariance
matrix of $x$ and $y$ is $R = A^T A$. You can think of the entries of this matrix as storing the
unnormalized correlations between the temperatures. Because $R^T = R$, this matrix is symmetric,
and hence has orthogonal eigenvectors.


Let’s assume that we are given a dataset with $n$ samples and $d$ dimensions (instead of just 2 dimensions
as shown above). We can transform it into the form given in Equation 19.1 by subtracting the mean from
each column and dividing the entire matrix by $\sqrt{n-1}$ (here $n$ plays the role of $N$ in Equation 19.1). We now
have a mean-centered data matrix $A$ with $n$ rows and $d$ columns, and the covariance matrix of our data is given by $A^T A$.
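
The construction above can be sketched in MATLAB as follows. The data matrix X is a small hypothetical example (6 samples of 3 variables) made up for this illustration; the result is compared against MATLAB's built-in cov function, which uses the same n − 1 normalization.

```matlab
% Hypothetical data: 6 samples (rows) of 3 variables (columns)
X = [ 2 10 1;
      4  9 3;
      6 13 2;
      8 12 5;
     10 15 4;
     12 16 7];

n  = size(X, 1);
mu = mean(X, 1);                           % column means (1-by-d)
A  = (X - repmat(mu, n, 1)) / sqrt(n - 1); % mean-centered, scaled data matrix
R  = A' * A;                               % d-by-d covariance matrix

disp(norm(R - cov(X)))                     % should be close to zero
```

Here repmat is used to subtract the column means explicitly, so the sketch does not depend on implicit expansion.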


Exercise 19.1 


Our overall goal is to show that if we take a unit vector $u$, project our mean-centered data onto it
(as $Au$), and examine the variance of the projected data, this variance is largest when $u$ is the
principal eigenvector of the covariance matrix $A^T A$.


1. First we’ll write down an expression for the variance of $Au$ (we’ll write this as $\mathrm{Var}[Au]$) as a
matrix multiplication. We’ll do this step together (i.e., we’ll show you how to do it). For this
part of the exercise you should make sure you understand the steps we performed.


If $A$ is in the form given in Equation 19.1, then $Au$ will have 0 mean (since $Au$ is a linear
combination of columns with 0 mean). Using the same logic that led us to conclude that
$A^T A$ is the covariance matrix of the data, $(Au)^T (Au)$ will give us the variance of the data
projected onto $u$ (remember that variance is just a special case of covariance where we are
comparing a quantity to itself). It’s worth noting that since $Au$ is a vector, the expression
$(Au)^T (Au)$ is known as the inner product, which is really the same as the dot product (that
is, $(Au)^T (Au) = Au \cdot Au$). Thus, the variance is given by the following equation.

\[ \mathrm{Var}[Au] = (Au)^T (Au) = u^T A^T A u \]

(note: we are applying the rule that $(AB)^T = B^T A^T$)


2. Substitute the eigenvalue decomposition, $V D V^T$, for the covariance matrix $A^T A$ (since
$A^T A$ is symmetric and real, we can substitute $V^T$ for the inverse of $V$ in the eigenvalue
decomposition).


3. Define the vector $y = V^T u$ and substitute it into the expression from part 2.


4. Expand out the expression in part 3 so that it is in terms of the squares of the elements of $y$
and the diagonal entries of $D$ in order of largest to smallest.


5. Show that $y$ is a unit vector by taking the inner product with itself and showing that it is equal
to 1 (recall that the inner product is the same as the dot product). Hint: $V V^T = I$ since $V$ is
square and has $d$ orthonormal columns.


6. Argue that since $y$ is a unit vector (which implies $\sum_{i=1}^{d} y_i^2 = 1$), the expression in part 4
is maximized when $y_i = 1$ when $i$ is the index of the principal eigenvector and $y_i = 0$ when
$i$ is any other index. To get a feel for why this is true, try writing out a specific case where,
perhaps, $y$ has two or three dimensions.


7. Show that we achieve the value of $y$ in part 6 (that is, where $y_i = 1$ when $i$ is the index of the
principal eigenvector and $y_i = 0$ when $i$ is any other index) when $u$ is the principal eigenvector
of $A^T A$.


8. What have you just shown?!? Make sure you have a sense of what you just did (don’t get lost
in the mathematical symbols).





Beyond the first principal component 


We’ve now gone into depth in understanding the first principal component and its amazing property of 
maximizing variance. The second principal component is simply going to be the direction that maximizes 
variance subject to the requirement that it is orthogonal to the first principal component. With a slight 
modification to your proof you can show that the second principal component will be in the direction of the 
eigenvector with the second largest eigenvalue. The trend continues for other principal components (i.e., the
$i$th principal component is the eigenvector with the $i$th largest eigenvalue).


19.2 Applications of PCA (thinking it through conceptually) [45 mins]


In this section you’re going to be thinking about what the PCA algorithm might do when applied in different 
domains. The focus of this section will be on trying to understand at a conceptual level what might happen 
when we apply PCA. In the next section, you'll be reading through an example of applying PCA to some 
actual data. 


Exercise 19.2 


For each application, hypothesize what the first principal component might be. That is, for each 
particular scenario what would the direction be that maximizes the variance of the data projected 
onto that direction? What might the second principal component be (that is a vector orthogonal to 
the first that maximizes the variance of the data)? 


1. Consider a dataset consisting of ratings from n users of m movies. Let’s assume that the 
ratings are numerical and are on a scale of 1 to 5 (5 being the best). Consider some collection 
of movies (they could be some specific movies or you could just think of movie genres) and a 
particular population of users (could be college students, QEA professors, or just the general 
population). Draw the data matrix A and label the rows and columns (e.g., with movies or 
users). In a qualitative sense, make a prediction as to what the first principal component would 
look like for this dataset. What might the second principal component look like? No numbers... 
just guess at which dimensions would be positive, negative, or close to 0 for your principal
components.

2. Consider a dataset consisting of the prevalence of the flu in various parts of the US. The CDC
maintains an animated map of the flu activity over time, which you can (and should) access
at https://www.cdc.gov/flu/weekly/usmap.htm. To simplify this data, let’s
think about the number of flu cases in each of the six major geographical regions of the
US.


If we think about our data matrix as consisting of a row for each week of measured flu activity
and each column as a region of the US, in a qualitative sense, make a prediction as to what the
first principal component would look like for this dataset. What might the second principal
component look like? No numbers... just guess at which dimensions would be positive,
negative, or close to 0 for your principal components.


Exercise 19.3 


With your table-mates, read through this post that shows the application of PCA to understanding
the US political leanings (if you are viewing this in DropBox preview and can’t click the link, go to
http://bit.ly/37n9qwe). Before starting, here are some process suggestions.


• Check in with folks at your table as to how they’d like to go through this document (e.g., read
the entire thing individually and come together and ask questions, read it individually but stop
after each major section to ask questions, read it aloud as a table).


• If you don’t understand something, you can either call over an instructor or note your confusion
on the whiteboard and keep going (e.g., if it’s something that doesn’t impede your understanding
of the main points in the article).


Solution 19.1


1. Solution is already given in the problem.


2. \[ \mathrm{Var}[Au] = u^T V D V^T u \]


3. \[ \mathrm{Var}[Au] = (V^T u)^T D (V^T u) = y^T D y \]


4. \[ \mathrm{Var}[Au] = y^T D y = \sum_{i=1}^{d} D_{i,i} \, y_i^2 \]


5. \[ y^T y = (V^T u)^T (V^T u) = u^T V V^T u = u^T I u = u^T u = 1 \]


6. If we choose $y_i = 1$ where $i$ is the index of the principal eigenvector, then the expression in
part 4 will give us $D_{i,i}$. Any other choice of $y$ will result in some weighted combination of the
eigenvalues (the diagonal elements of $D$) where the weights $y_i^2$ are all non-negative and add up to 1.
It is easy to see that putting any weight on a non-maximal eigenvalue will result in a lower
variance as computed by the expression in part 4.


7. Since $y = V^T u$, $y_i$ is the dot product of the $i$th eigenvector, $v_i$, with $u$. Since we
assume all of the eigenvectors are unit vectors and mutually orthogonal, if we set $u$ to be the
principal eigenvector of $A^T A$, then the dot product of $u$ and $v_i$ will be 1 for $i$ corresponding
to the principal eigenvector and 0 for all other indices.


8. You just showed that the direction along the principal eigenvector of the covariance matrix
maximizes the variance of the projected data. That’s pretty cool!
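
The result can also be checked numerically. The sketch below is an illustration under assumptions, not part of the proof: it uses a small hypothetical data matrix X, computes Var[Au] at the principal eigenvector, and compares it with Var[Au] over a deterministic grid of unit vectors on the sphere.

```matlab
% Hypothetical data matrix (rows are samples, columns are variables)
X = [2 10 1; 4 9 3; 6 13 2; 8 12 5; 10 15 4; 12 16 7];
n = size(X, 1);
A = (X - repmat(mean(X, 1), n, 1)) / sqrt(n - 1);
R = A' * A;

[V, D] = eig(R);
[lambda, idx] = sort(diag(D), 'descend');
u_star = V(:, idx(1));                     % principal eigenvector

var_star = (A * u_star)' * (A * u_star);   % Var[Au] at the principal eigenvector

% Sweep a deterministic family of unit vectors over the sphere
best = 0;
for theta = linspace(0, pi, 60)
    for phi = linspace(0, 2*pi, 120)
        u = [sin(theta)*cos(phi); sin(theta)*sin(phi); cos(theta)];
        best = max(best, (A * u)' * (A * u));
    end
end

disp([var_star, lambda(1), best])  % var_star equals lambda(1); best does not exceed it
```

No direction in the sweep should give a larger projected variance than the principal eigenvector, and the maximum equals the largest eigenvalue.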


Learning Objectives


Concepts
• Understand the connection between PCA and eigenvalues and eigenvectors.
• Understand how to use PCA to carry out data compression.
• Understand the idea of using Eigenfaces to do facial recognition.
MATLAB skills
• Use “eig” to carry out a PCA.
• Implement facial recognition using PCA.
• Determine the accuracy of PCA for different numbers of principal components.


20.1 Principal Component Analysis Revisited 


As we saw in class, PCA is an algorithm in which we express our original data as a linear combination of the
eigenvectors corresponding to the largest eigenvalues of the covariance matrix. We examined the property
of PCA that if we project our data onto these vectors, the variance of the projected data is maximized. To
refresh your memory further, here is the temperature plot for Boston, Sao Paolo, and Washington DC.

Figure 20.1: Temperatures in three cities and the eigenvectors of the covariance matrix. (Axes: Boston, Sao Paolo, and Wash. D.C. temperatures, each mean subtracted.)


We also briefly mentioned a second property of PCA, which is that it can be thought of as an optimal way
to compress our data down to a smaller set of numbers. This idea, also known as dimensionality reduction,
is going to be a view that we explore in this assignment. If you’d like, here are a few external resources on
PCA:


• http://www.cs.otago.ac.nz/

• http://dai.fmph.uniba.sk/courses/ml/sl/PCA.pdf

• https://deeplearning4j.org/eigenvector/linear

• http://www.cerebralmastication.com/2010/09/principal-component-analysis-pca-vs-ordinary-least-squares-ols-a-visual-explination/

• http://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues/


PCA in two dimensions


In general, PCA is conducted on data that is mean-centered (i.e., the data has had the mean of each variable 
subtracted out). To refresh your memory of PCA and scaffold the introduction of the view of PCA as 
compressing a dataset, let’s think about a simple example data set D. 


\[ D = \begin{bmatrix} -1 & 3 \\ 1 & 4 \\ 3 & 4 \\ 7 & 5 \\ 10 & 9 \end{bmatrix} \tag{20.1} \]


Exercise 20.1 


1. Create a plot of $D$ as a set of points in the xy-plane.


2. Define a matrix $\tilde{D}$ which is the mean-centered version of $D$ and plot $\tilde{D}$ as a set of points in
the xy-plane.


3. The principal components ($p_1$ and $p_2$) are the eigenvectors of the covariance matrix of the
mean-centered data $\tilde{D}$. Compute $p_1$ and $p_2$ and plot them on top of the mean-centered data.


4. Compute the projection of your data onto the eigenvector which corresponds to the largest
eigenvalue, which in this case is $p_2$. This is the “reduced dimensionality” version of your
data, called $B$, which only includes information about the projection along $p_2$. (We reduced the
2-dimensional data to 1-dimensional data.) Plot the original data $D$ and the reduced data $B$.


Exercise 20.2

1. Can you recreate $D$ perfectly from $B$?


2. What would have happened if you had created $B$ using only information about the values
along $p_1$ instead of $p_2$?


3. How might you quantify how well you can represent $D$ in this reduced dimensionality form?


4. If you received a new piece of data, how would you go about representing this as a linear
combination of $p_1$ and $p_2$?





Data Compression via PCA 


In this exercise you will perform a simple data compression exercise, similar to the one you did in a previous 
homework assignment. You will use temperature data from 3 cities over 10 years, as training data and use it 
to compress a year’s worth of temperature data from 3 cities into a 2 x 365 matrix. In other words, you will 
represent 3 x 365 numbers (daily temperature data from 3 cities over 1 year), using 2 x 365 values. This 
compression is lossy, in that you will lose some information. However, by representing the data along the
two most significant eigenvectors of the covariance matrix, you can reduce this data loss, because these two 
directions capture the bulk of the variation in the data set. Please note that while we have laid out the steps 
you need to take here quite explicitly, it is important for you to fully understand what each step does. You 
will be using very similar steps in your project. 


Exercise 20.3 


1. Load the file avg_temperatures_pt2.mat. You will have 6 data vectors in your 
workspace. b_tr, w_tr, s_tr which represent 10 years of training data for the average 
daily temperatures in Boston, Washington DC, and Sao Paolo, respectively. The vectors 
b_new, w_new, S_newrepresent an additional year of data for the three cities — this 
is the data that you will compress using statistical knowledge of the previous 10 years of 
data. Create a covariance matrix R using the 10 years worth of temperature data from Boston, 
Washington DC and Sao Paolo (in that order). 





. Perform an eigendecomposition of the matrix R, and make a new matrix V, which has the 2 
eigenvectors corresponding to the 2 largest eigenvalues of R. You should use MATLAB’s eig 
function. Let these eigenvectors be v; and v3. 


3. Create centered (i.e. subtract the mean), versions of the new temperature data vectors, and 
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create a 3 x 365 matrix T which has the centered temperatures of Boston, Washington DC 
and Sao Paolo as its rows (in that order). This matrix is a representation of the data you are 
now going to compress. Let the -th column of T be t;. 


4. Take the dot product of each column of the matrix T (which is a vector of the temperatures of 
Boston, Washington DC and Sao Paulo for a given day) with the two eigenvectors in matrix 
V, and save the values. Let these quantities be called $\alpha_{1i}$ and $\alpha_{2i}$. In other words, 


$\alpha_{1i} = \mathbf{v}_1^T \mathbf{t}_i$ 


$\alpha_{2i} = \mathbf{v}_2^T \mathbf{t}_i$ 


You can do this using matrix multiplications. 


You should now have 365 different values for $\alpha_{1i}$ and $\alpha_{2i}$, which are a compressed representation 
of 3 x 365 different temperature values. Moreover, these values are the components of 
the temperature data that lie in the directions of the two eigenvectors of the covariance matrix 
corresponding to the largest eigenvalues. From what we saw in the previous two classes, these 
vectors represent the two orthogonal directions in the data that have the most 
variation, and hence are the most "important" directions. Of course, there is a third direction (since 
the temperature vectors live in a 3-dimensional space), which we are discarding. But since this 
is the direction in which there is the least amount of variation in the data set, we do not lose 
too much information. 


5. You can now check how well your compression worked by using the values of $\alpha_{1i}$ and $\alpha_{2i}$ 
to reconstruct 365 different 3 x 1 vectors, each representing the temperatures for the three 
cities on one of the 365 days. Let $\hat{\mathbf{t}}_i$ represent the reconstructed temperature vector on the $i$-th day. 
Using what you know about projections onto orthonormal vectors, reconstruct $\hat{\mathbf{t}}_i$ using $\alpha_{1i}$, 
$\alpha_{2i}$, $\mathbf{v}_1$ and $\mathbf{v}_2$. Repeat this for all 365 days. 


6. On the same axes, plot the original and reconstructed temperature for Boston. Repeat this for 
Washington DC and Sao Paulo. Observe how close the reconstructions are for the different 
data sets. 


7. How accurately do you think you could represent the data if you used 3 eigenvectors instead of 
2? 


8. If you feel inspired, repeat the above with temperature data for four different cities, and 2 or 3 
different eigenvectors. 





While this example can be thought of as a “toy” example where we are representing 3 dimensional data 
using 2 dimensions, there are many applications for which there may be many more dimensions in the data 
for which accurate representations can be made using only a few dimensions. Additionally, you should note 
that such dimensionality reduction techniques are not just useful in compression, but they are also useful in 
speeding up computation. We can often get away with analyzing data over a small number of important 
dimensions, and this is an important technique when we deal with large amounts of data. Overall, this 
class of techniques is called Principal Component Analysis (PCA), since we are performing analysis along a 
few principal component directions of the data. 
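
As a rough illustration of the compress-and-reconstruct idea described above, here is a minimal MATLAB sketch on synthetic data. The sine-shaped "temperature" series, the noise level, and the variable names (X, Vp, alpha, That) are all assumptions made for illustration; this is not the course data set and not the official solution to the exercise.

```matlab
% Sketch: compress 3-D data to 2 numbers per sample with PCA, then reconstruct.
% The data is synthetic (made up for illustration), not the course data.
rng(0);                                  % fix the seed so the run is repeatable
N = 365;                                 % number of samples (days)
day = (1:N)';
% three correlated "temperature" series stored as columns (N x 3)
X = [20*sin(2*pi*day/N), 15*sin(2*pi*day/N) + 5, -8*sin(2*pi*day/N) + 20] ...
    + randn(N, 3);

mu = mean(X);                            % 1 x 3 row of column means
Xc = X - mu;                             % centered data (needs R2016b or later)
R = (Xc' * Xc) / (N - 1);                % 3 x 3 covariance matrix

[V, D] = eig(R);                         % eigenvectors are the columns of V
[~, order] = sort(diag(D), 'descend');   % sort explicitly, largest first
Vp = V(:, order(1:2));                   % 3 x 2: the two principal directions

T = Xc';                                 % 3 x N, one column per day
alpha = Vp' * T;                         % 2 x N compressed representation
That = Vp * alpha;                       % 3 x N reconstruction (a projection)

rmsErr = sqrt(mean((T(:) - That(:)).^2));
fprintf('RMS reconstruction error: %.3f\n', rmsErr);
```

In this sketch the total squared reconstruction error equals (N - 1) times the smallest eigenvalue of R, which is one way to see why discarding the direction of least variation loses the least information.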


20.2 Face Data Compression via PCA 


You are now ready to start applying PCA to face data. You have already seen this in a previous class assignment, 
except in that assignment you had the help of a genie. Load the MATLAB files classdata_train.mat 
and classdata_test.mat. These are the training and test datasets with photos of your classmates. 
The files contain images as well as the identity of the person in each image (coded as an integer 
from 1 to 89). 

Remember that the principal eigenvectors of the covariance matrix tell you the directions of greatest 
variation in a data set and also the directions that optimally compress our data. Your job is to use the training 
images to build a model for your faces such that you can compress a many-pixeled test face image (pick one 
from the test image array) using a small set of image vectors (e.g., 10, 20, or 50). 

Before you dig into this problem, think through how you would formalize the face data compression 
as a problem that you can solve with the linear algebra techniques that you’ve learned so far. There is no 
exercise to answer, but we want you to think through these steps before going further in the assignment. 


* How will you choose your set of image vectors? 


* Come up with the steps needed to do the above and write some pseudocode (e.g., load the data, 
vectorize the images, etc.). 


* How could you tell whether your compression algorithm works? (These could be either quantitative 
metrics, like root-mean-squared error, or qualitative metrics.) 


20.3 Eigenfaces for Face Recognition 


It’s time to bring it all together and finally plunge into the prosopagnosia (look it up) problem (or at least 
build some facial recognition software, but alliteration is fun). You will implement the eigenfaces algorithm 
to identify photos of your classmates. While it sounds fancy, you have almost all of the pieces needed to 
understand and implement Eigenfaces (the last necessary piece you will pick up momentarily). Here are the 
major steps in the Eigenfaces algorithm. 


1. Use PCA to compute the k principal components of the training face images (the k eigenvectors with 
largest eigenvalues). 


2. Project the training and test face images onto the k principal components. We'll call this the facespace 
representation of our original images. 


3. For each of the test images, compute the closest match between the test image (represented as a 
k-dimensional vector in facespace) and the training images (again, in facespace). The notion of “closest 
match” here can be described in a few different ways, but the easiest thing to do is to use the 
Euclidean distance to define how far apart two points are. In this way, you would look for the training 
point that has the smallest Euclidean distance for a particular test point and predict the identity of the 
test point to be the same as the identity of this closest training point. This method of classification 
is known as nearest neighbor classification and it is the one new concept you need to implement 
Eigenfaces. 
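
To make the nearest neighbor step in item 3 concrete, here is a small MATLAB sketch. The feature values and labels are made up for illustration (in the assignment they would come from projecting real images onto the k principal components), and the variable names are our own assumptions rather than anything defined in the course files.

```matlab
% Sketch: nearest-neighbor classification in a k-dimensional "facespace".
% All data below is made up for illustration.
trainFeatures = [0 0; 0 1; 5 5; 6 5; 10 0];   % 5 training points as rows, k = 2
trainLabels   = [1; 1; 2; 2; 3];              % identity of each training point
testFeatures  = [0.2 0.4; 5.5 4.8; 9 1];      % 3 test points as rows
testLabels    = [1; 2; 3];                    % true identities (for scoring)

nTest = size(testFeatures, 1);
predicted = zeros(nTest, 1);
for t = 1:nTest
    % Euclidean distance from this test point to every training point
    diffs = trainFeatures - testFeatures(t, :);
    dists = sqrt(sum(diffs.^2, 2));
    [~, nearest] = min(dists);                % index of the closest training point
    predicted(t) = trainLabels(nearest);      % copy that point's identity
end

accuracy = mean(predicted == testLabels);     % fraction identified correctly
fprintf('Accuracy: %.2f\n', accuracy);
```

The same accuracy calculation (the fraction of test points whose predicted identity matches the true identity) is one reasonable way to score a full Eigenfaces implementation.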


Exercise 20.4 


Earlier in this assignment you wrote some pseudo code for face compression, which as you can 
see from the description of Eigenfaces above, gets you most of the way there. Before you actually 
implement Eigenfaces, we'd like you to extend your pseudocode to cover the whole Eigenfaces 
algorithm. In addition to the steps of Eigenfaces described above, you should also think about the 
steps needed to calculate the accuracy of your system (i.e., how often does it get the person’s identity 
correct). 


Exercise 20.5 


Implement the eigenfaces algorithm. 


1. Your code, which can be a script, function, or livescript, should use the training and test sets 
of images provided: classdata_train.mat and classdata_test.mat, respectively. 


2. You may want to start by identifying one face from the test set, but by the time you are done, 
your code should run through all of the test images and report the fraction it guesses correctly. 


3. Test different numbers of eigenvectors. How many does it take to guess right most of the time? 


4. If you’d like, time how long it takes your code to run. Can you do anything more efficiently to 
make it run faster? (Use tic and toc in MATLAB for timing.) 


5. Visualize the first few eigenfaces. Can you interpret what they mean? 


6. Generate a figure that depicts the success rate (accuracy at determining the identity of a person 
in an image) versus the number of eigenfaces used. 


7. Generate a figure that depicts the success rate (accuracy at determining the identity of a person 
in an image) versus the number of eigenfaces used where the training data consists of only 
images of people not smiling and the test data consists of only images of people smiling (these 
are located in classdata_non_smiles.mat and classdata_smiles.mat, respectively). 
Comment on the difference in performance when using these files versus the 
data from the previous parts of this problem. 


Guidelines: 
* You should comment your code (use %) so others can read and understand it. 


* Don’t use the command pca, but instead build your algorithm using either the eig or eigs 
command (eigs computes just a few eigenvectors, which can be faster when you only care 
about the eigenvectors with large eigenvalues). We want you to think through all the steps 
involved in your facial recognition program, and that means doing the math “yourself”. 
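
As a hedged illustration of the eig versus eigs guideline, the sketch below picks the k eigenvectors with the largest eigenvalues both ways on a small made-up symmetric matrix. The matrix R and the names Vk, Vk2 are assumptions for illustration only. We sort the output of eig explicitly rather than relying on the order in which it returns eigenvalues.

```matlab
% Sketch: pick the k eigenvectors with the largest eigenvalues, two ways.
R = [4 1 0; 1 3 0; 0 0 1];                  % small symmetric example matrix
k = 2;

% Option 1: full decomposition with eig, then sort and keep the top k
[V, D] = eig(R);
[vals, order] = sort(diag(D), 'descend');   % do not rely on eig's output order
Vk = V(:, order(1:k));                      % columns: top-k eigenvectors

% Option 2: eigs returns the k largest-magnitude eigenvalues directly.
% For a covariance matrix (no negative eigenvalues) these are the k largest.
[Vk2, Dk2] = eigs(R, k);

% Eigenvectors are only defined up to sign, so compare the eigenvalues
% and the projection matrices rather than the vectors themselves.
disp(vals(1:k)');
disp(diag(Dk2)');
disp(norm(Vk * Vk' - Vk2 * Vk2'));          % close to 0
```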


(Optional) Extensions (These are not spelled out in much detail. We recommend you talk to a member 
of the teaching team before trying these, especially the last two.) 


* Analyze the mistakes your algorithm makes (particularly when training on non-smiles and 
testing on smiles). 


* Use Eigenfaces to do smile detection instead of identity recognition. 


* Combine Eigenfaces with a classifier other than nearest neighbors (e.g., formulate an LSAE to 
create a series of one person versus everyone else detectors). 


* Get your system working on live video. 


Solution 20.1 
1. >> D = [-1 3; 1 4; 3 4; 7 5; 10 9] 
>> plot(D(:,1),D(:,2),'x') 
>> axis ([-3 12 -3 12]) 














2. >> mumatrix=[mean(D(:,1))*ones(5,1) mean(D(:,2))*ones(5,1) ] 
>> tildeD=D-mumatrix 
>> plot(tildeD(:,1),tildeD(:,2),'x') 
>> axis ([-7 7 -7 7]) 














3. >> [Vec,Diam]=eig(tildeD'*tildeD) 
>> hold on 
>> quiver(0,0,Vec(1,1),Vec(2,1)) 
>> quiver(0,0,Vec(1,2),Vec(2,2)) 


4. >> proj=tildeD*Vec(:,2) 
>> B=proj*Vec(:,2)' 
>> hold on 
>> plot(tildeD(:,1),tildeD(:,2),'x') 
>> plot(B(:,1),B(:,2),'o') 
>> axis([-6 8 -6 8]) 














Solution 20.2 


1. No. If we write a data point as $a\mathbf{p}_1 + b\mathbf{p}_2$, then the reduced-dimension version is $b\mathbf{p}_2$. It’s 
impossible to recover $a$, which is the information in the perpendicular direction. 


2. We would get the information in the perpendicular direction, which we can interpret as the 
“error” in reducing the dimension from D to B. 


3. You can use the error $B - D$. 


4. We can write a new data point $\mathbf{d}$ as 


$(\mathbf{d} \cdot \mathbf{p}_1)\,\mathbf{p}_1 + (\mathbf{d} \cdot \mathbf{p}_2)\,\mathbf{p}_2$. 


Solution 20.3 


1. >> A= (1/sqrt(7304))*[b_tr-mean(b_tr) w_tr-mean(w_tr) s_tr-mean(s_tr) ]; 
>> R=A'*A 


2. >> [V,D]=eig(R) 
>> Vp=[V(:,2) V(:,3)] 


3. >> T = [b_new-mean(b_tr) w_new-mean(w_tr) S_new-mean(s_tr)]' 


4. >> alpha=Vp'*T; 


Solution 20.5 


A reference implementation of Eigenfaces is linked from the Canvas assignment page. 


Chapter 21 


Week 8a: Applications of PCA 





Schedule 
21.1 Debrief [20 minutes] 
21.2 Applications of PCA (thinking it through conceptually) [30 mins] 
21.3 Applying PCA to Analyze Movie Reviews (implementation) [40 minutes] 





21.1 Debrief [20 minutes] 


We'd like you to take this opportunity to discuss with your group-mates the approach you took to facial 
recognition and the way you implemented it in MATLAB. The goal here is to think through the method 
at different levels from conceptual to code, resolving any confusion, and identifying whether any issues 
are primarily at the conceptual-level, the implementation-level, or the translation space in between. We 
highly recommend that you set aside your existing code and focus on developing the approach 
with your group-mates. Here are some steps to guide you. 

Draw a diagram that shows the different steps of doing the Eigenfaces algorithm for face identification 
(try to come up with a consensus for your group, but do take note of any major disagreements). Use boxes to 
encompass major steps of your algorithm and arrows to indicate which steps flow into which others. We 
will call this sort of diagram a process flow diagram. When constructing your process flow diagram, you 
might consider these questions. 


* Will you pre-process the data? If so, how? 

* How will you compute the Eigenfaces? 

* How will you decide how many Eigenfaces to include? Which Eigenfaces might you leave out? 

* How will you identify a face? 

* How will you measure the accuracy of your implementation? 


On your process flow diagram, list some of the key strategies or steps you performed in order to implement 
each step (e.g., next to “Compute the Eigenfaces” you might list “subtract the mean” or “compute 
the Eigenvectors of the covariance matrix using the eigs function”). 


21.2 Applications of PCA (thinking it through conceptually) [30 mins] 


This is a repeat of what we had in the document last time. If you already had a chance 
to do this problem, your task is to come up with your own example of how PCA could be 
applied to data analysis. You can choose the domain! 


In this section you’re going to be thinking about what the PCA algorithm might do when applied in 
different domains. The focus of this section will be on trying to understand at a conceptual level what might 
happen when we apply PCA. In the next section, you’ll be reading through an example of applying PCA to 
some actual data. 


Exercise 21.1 


For each application, hypothesize what the first principal component might be. That is, for each 
particular scenario what would the direction be that maximizes the variance of the data projected 
onto that direction? What might the second principal component be (that is a vector orthogonal to 
the first that maximizes the variance of the data)? 


1. Consider a dataset consisting of ratings from n users of m movies. Let’s assume that the 
ratings are numerical and are on a scale of 1 to 5 (5 being the best). Consider some collection 
of movies (they could be some specific movies or you could just think of movie genres) and a 
particular population of users (could be college students, QEA professors, or just the general 
population). Draw the data matrix A and label the rows and columns (e.g., with movies or 
users). In a qualitative sense, make a prediction as to what the first principal component would 
look like for this dataset. What might the second principal component look like? No numbers... 
just guess at which dimensions would be positive, negative, or close to 0 for your principal 
components. 


2. Consider a dataset consisting of the prevalence of the flu in various parts of the US. The CDC 
maintains an animated map of the flu activity over time, which you can (and should) access 
at https://www.cdc.gov/flu/weekly/usmap.htm. To simplify this data, let’s 
think about the number of flu cases in each of the six major geographical regions of the 
US. 


If we think about our data matrix as consisting of a row for each week of measured flu activity 
and each column as a region of the US, in a qualitative sense, make a prediction as to what the 
first principal component would look like for this dataset. What might the second principal 
component look like? No numbers... just guess at which dimensions would be positive, 
negative, or close to 0 for your principal components. 


Exercise 21.2 


With your table-mates, read through this post that shows the application of PCA to understanding 
US political leanings (if you are viewing this in DropBox preview and can’t click the link, go to 
http://bit.ly/37n9qwe). Before starting, here are some process suggestions. 


* Check in with folks at your table as to how they’d like to go through this document (e.g., read 
the entire thing individually and come together and ask questions, read it individually but stop 
after each major section to ask questions, read it aloud as a table). 


* If you don’t understand something, you can either call over an instructor or note your confusion 
on the whiteboard and keep going (e.g., if it’s something that doesn’t impede your understanding 
of the main points in the article). 





21.3 Applying PCA to Analyze Movie Reviews (implementation) [40 minutes] 


We're going to be using a LiveScript notebook to go through an example of using PCA to analyze movie 
ratings. This will be an instructor led activity, but we’d love to see lots of participation in the chat (or 
verbally). If you'd like to run the code for yourself so that you can follow along more easily, you can get it 
from MATLAB drive (shortened link for those that are following along in Miro: https://bit.ly/30toKOv). We 
are also including the output of the notebook here for easy reference. 


Analyzing Movie Ratings Using Principal Components Analysis 


In this notebook we're going to be working with the MovieLens dataset. In particular, we're going to be working 
with the MovieLens 1M dataset, https://grouplens.org/datasets/movielens/1m/, which contains: 





* 1 million ratings 
* from 6000 users 
* on 4000 movies 


The dataset is pretty old (from 2003), so some of the movies (that the QEA teaching team loved when they were 
still youngins) are now considered "classics". Yikes! 


The goals of this activity are as follows. 


1. To work with a different type of data than images or temperatures (here we will be working with ratings). 
Applying the tools you have learned in this module to different domains will help solidify your learning, 
help you see connections, and potentially get you excited for your module 1 project. 

2. To see a few different techniques for examining the results of PCA. 

3. To see the connection between PCA on rows versus columns of a data matrix. 

4. To have some fun! 


To get started, we're going to load the data and display a little bit of the data. Please see the comments in the 
code for some more information. 


load('movielens.mat'); 

sizeOfMovies = size(movies) 
sizeOfMovies = 
        3706           3 
% the cell array 'movies' is 3706 by 3. Each of the 3706 entries corresponds to a 
% particular movie, and along the second dimension the entries correspond to the movie ID, 
% the movie title, and the movie genre 

% Here we extract the information about the first movie in the dataset 
[movieId, movieTitle, movieGenre] = movies{1,:} 
movieId = 
     1 
movieTitle = 
'Toy Story (1995)' 
movieGenre = 
'Animation|Children's|Comedy' 

sizeOfUsers = size(users) 
sizeOfUsers = 
        6040           5 
% the cell array 'users' is 6040 by 5. Each of the 6040 entries corresponds to a 
% particular user, and along the second dimension the entries correspond to the user ID, 
% gender (unfortunately, coded in a binary fashion), zip code, age bracket, 
% and occupation. 

% Here we extract the information about the first user in the dataset 
[userId, genderBinary, zipCode, ageBracket, occupation] = users{1,:} 
userId = int64 
    1 
genderBinary = 
'F' 
zipCode = 
'48067' 
ageBracket = 
'Under 18' 
occupation = 
'K-12 student' 

figure; 
histogram(categorical({users{:,2}})); 
figure; 
histogram(categorical({users{:,4}})); 
ratingsSize = size(ratings) 
ratingsSize = 1x2 
        6040        3706 
% the matrix 'ratings' is 6040 by 3706 and encodes the rating that a 
% particular user (row) gave to a particular movie (column). The ratings 
% are 1, 2, 3, 4, or 5 stars or the special value NaN (not a number), if the 
% user didn't rate that particular movie. 

% Let's look at the ratings that were given to the first movie in the 
% dataset, which as we saw is Toy Story. We can do this using the 
% histcounts function (we'll ignore missing values in this analysis) 
possibleRatings = [1 2 3 4 5]; 
% Note: the Inf here is needed to capture the 5 star ratings (see the 
% documentation of 'histcounts' for details). 
% Movies are the columns of ratings, so the first movie is column 1. 
nRatings = histcounts(ratings(:,1), [possibleRatings Inf]); 
figure; 
bar(possibleRatings, nRatings); 
xlabel('Rating'); 
ylabel('Number of Users'); 
title(['Ratings for ', movieTitle]) 
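
To see why the bin edges in the histcounts call end in Inf, here is a tiny sketch on made-up ratings (the vector toy is an assumption for illustration). histcounts treats every bin as closed on the left and open on the right, except the last bin, which includes both of its edges; it also ignores NaN values.

```matlab
% Sketch: why the bin edges end in Inf (toy ratings, made up for illustration).
toy = [1 2 3 4 5 5 NaN];                  % one rating each of 1-4, two 5s, one missing
edges = [1 2 3 4 5];

% bins: [1,2) [2,3) [3,4) [4,5) [5,Inf] -> one count per star rating
withInf = histcounts(toy, [edges Inf])

% bins: [1,2) [2,3) [3,4) [4,5] -> the 4s and 5s land in the same bin
withoutInf = histcounts(toy, edges)
```

With these toy ratings, withInf is [1 1 1 1 2] and withoutInf is [1 1 1 3], so leaving out the Inf edge would merge the 4-star and 5-star counts.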


[Figure: bar chart titled “Ratings for Toy Story (1995)”; x-axis: Rating, y-axis: Number of Users] 


Okay, yeah that was a pretty great movie. Let's check out a less good movie, Anaconda (this one was a sort of 
love-it-or-hate-it type of movie). Highly recommended!! Look at this cast: https://www.imdb.com/title/tt0118615/fullcredits !!! 








[movieId, movieTitle, movieGenre] = movies{1384,:} 
movieId = 
        1499 
movieTitle = 
'Anaconda (1997)' 
movieGenre = 
'Action|Adventure|Thriller' 

% Anaconda is the 1384th movie, so its ratings are in column 1384 
nRatings = histcounts(ratings(:,1384), [possibleRatings Inf]); 
figure; 
bar(possibleRatings, nRatings); 
xlabel('Rating'); 
ylabel('Number of Users'); 
title(['Ratings for ', movieTitle]) 


[Figure: bar chart titled “Ratings for Anaconda (1997)”; x-axis: Rating, y-axis: Number of Users] 


Cleaning up the Data 
Before we start analyzing this data, we're going to do a few things to make the problem a bit easier to handle 
(we'll elaborate on each of these steps a bit when we actually write the code to perform the cleaning operation). 


1. Filter out movies that had 500 or fewer reviews and users who reviewed 100 or fewer movies. All of the 
analysis we're going to do today would work with the original data, but the results are a bit harder to 
interpret. 

2. Fill in any missing entries by using the average rating for the movie. 

3. Subtract the mean rating for each movie and then subtract the mean rating for each user. This will focus 
our model on understanding interactions between particular movies and particular users rather than 
capturing things like good versus bad movies or critical versus uncritical users. 


The first step is to filter out rare movies and users that didn't review many movies. The helper function we call 
below filterOutRarities is defined in the last cell of this notebook. 


% Note: several helper functions, including 'filterOutRarities', are defined 
% in the last cell of this livescript 
[movies, users, ratings] = filterOutRarities(movies, users, ratings, 500, 100); 
moviesSize = size(movies) 
moviesSize = 
   617     3 
usersSize = size(users) 
usersSize = 
        2909           5 
ratingsSize = size(ratings) 
ratingsSize = 
        2909         617 


Since we're going to be applying PCA to this data we're going to have to deal with the fact that we have a bunch 
of missing values in our ratings matrix (i.e., movies that particular users did not rate). Filling in missing values 
is called data imputation in the field of data science. There are many ways to do data imputation, but we've 
chosen a particularly easy strategy of simply replacing any missing ratings with the average rating of that particular 
movie (e.g., if a user didn't rate Toy Story, we would fill it in with the average rating of Toy Story based on the 
other users in the dataset who actually rated that movie). 


ratingsFilled = fillmissing(ratings, 'constant', nanmean(ratings)); 





As a final data cleaning step, we're going to subtract out the mean of each column and then each row. These 
two steps will focus our analysis on the interaction of movies and users (rather than on either entity in isolation). 


ratingsMean0 = ratingsFilled - mean(ratingsFilled); 
ratingsMean0 = ratingsMean0 - mean(ratingsMean0, 2); 

% verify that we end up with a matrix with mean 0 for each row and column 
disp(['Absolute value of sum of columns means ', num2str(sum(abs(mean(ratingsMean0))))]); 
Absolute value of sum of columns means 5.2097e-11 
disp(['Absolute value of sum of row means ', num2str(sum(abs(mean(ratingsMean0,2))))]); 
Absolute value of sum of row means 1.7597e-13 
size(ratingsMean0) 
ans = 
        2909         617 
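
If it helps to see the imputation and centering steps on something small enough to check by hand, here is a sketch on a made-up 4-by-3 ratings matrix. The matrix toyRatings is an assumption for illustration, and we use mean(..., 'omitnan') in place of nanmean so that the sketch does not depend on a toolbox; the notebook code itself is unchanged.

```matlab
% Sketch: mean imputation followed by column and row centering on a toy matrix.
% Rows are users, columns are movies; NaN marks a missing rating (made-up data).
toyRatings = [5   NaN 1;
              4   2   NaN;
              NaN 3   2;
              3   1   1];

colMeans = mean(toyRatings, 'omitnan');           % per-movie mean, ignoring NaN
filled = fillmissing(toyRatings, 'constant', colMeans);

centered = filled - mean(filled);                 % subtract each column's mean
centered = centered - mean(centered, 2);          % then subtract each row's mean

disp(filled);
disp(max(abs(mean(centered))));                   % largest column mean: about 0
disp(max(abs(mean(centered, 2))));                % largest row mean: about 0
```

Here the missing entry for the third user and first movie is filled with 4, the average of the observed ratings 5, 4, and 3 for that movie. Subtracting the row means after the column means leaves both sets of means at zero, up to rounding.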


PCA on Users 


What does it mean to do PCA where we think of each user as an observation and each dimension as a rating of 
a movie? 


* We need to make sure that our matrix has movies across the columns and users are represented as 
rows. 
* This is already the shape of our matrix ratingsMean0, so we are good to go. 


Now we can do our standard procedure for PCA: compute the covariance matrix and its principal Eigenvectors. 





% We've not included the normalization term of 1/(N-1) here as it doesn't 
% affect the Eigenvectors and it will allow us to make an interesting 
% connection later on. 
moviesByMovies = ratingsMean0'*ratingsMean0; 
[V, D] = eigs(moviesByMovies, 4); 
size(V) 
ans = 
   617     4 
figure; 
plot(diag(D)); 
title('PCA with Users as Observations and Movies as Dimensions') 
xlabel('Principal Component Number'); 
ylabel('Eigenvalue'); 
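
The comment in the code above says that leaving out the 1/(N-1) normalization does not affect the Eigenvectors. Here is a small sketch that checks this claim on random made-up data; the matrix A and the names S, R, Vs, Vr are assumptions for illustration only.

```matlab
% Sketch: scaling a matrix by a positive constant rescales its eigenvalues
% but leaves its eigenvectors unchanged (random made-up data).
rng(1);
A = randn(50, 4);
A = A - mean(A);                         % center the columns
N = size(A, 1);

S = A' * A;                              % unnormalized, as in the notebook code
R = S / (N - 1);                         % covariance matrix

[Vs, Ds] = eig(S);
[Vr, Dr] = eig(R);
[ds, idxS] = sort(diag(Ds), 'descend');  % sort both sets largest first
[dr, idxR] = sort(diag(Dr), 'descend');
Vs = Vs(:, idxS);
Vr = Vr(:, idxR);

% Eigenvectors agree up to sign, so |Vs' * Vr| should be the identity
disp(abs(Vs' * Vr));
disp((ds ./ dr)');                       % every ratio equals N - 1
```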




















[Figure: line plot titled “PCA with Users as Observations and Movies as Dimensions”; x-axis: Principal Component Number, y-axis: Eigenvalue] 


Understanding the Principal Components 

The principal components themselves tell us something about the dimensions of variability among individual 
user ratings. Let's look at the first principal component and see if it correlates with anything we know about the 
individual users. 


Examining Movies 


The Eigenvectors we just computed have 617 elements each, where each element corresponds to one of our movies. 
One way to try to understand what these Eigenvectors represent is to examine movies that have the highest 
and lowest values for each of these Eigenvectors. 


for i = 1:size(V, 2) 
    disp(['Component ', num2str(i)]); 
    highLowMovies = getHighAndLowMovies(V(:,i), movies) 
end 


Component 1 
highLowMovies = 
        1              | 2              | 3 
   1  | 'Independ...   | 'Action|S...   |  0.1968 
   2  | 'Armagedd...   | 'Action|A...   |  0.1487 
   3  | 'Jurassic...   | 'Action|A...   |  0.1390 
   4  | 'Star War...   | 'Action|A...   |  0.1344 
   5  | 'Twister ...   | 'Action|A...   |  0.1205 
   6  | 'Lost Wor...   | 'Action|A...   |  0.1135 
   7  | 'Men in B...   | 'Action|A...   |  0.1078 
   8  | 'Rock, Th...   | 'Action|A...   |  0.1075 
   9  | 'Speed (1...   | 'Action|R...   |  0.1040 
  10  | 'Forrest...    | 'Comedy|R...   |  0.1013 
  11  | 'Fargo (1...   | 'Crime|Dr...   | -0.1061 
  12  | 'Being Jo...   | 'Comedy'       | -0.0977 
  13  | 'Rushmore...   | 'Comedy'       | -0.0924 
  14  | 'Clockwor...   | 'Sci-Fi'       | -0.0879 
  15  | 'American...   | 'Comedy|D...   | -0.0858 
  16  | 'Pulp Fic...   | 'Crime|Drama'  | -0.0794 
  17  | '2001:A...     | 'Drama|My...   | -0.0781 
  18  | 'Raising ...   | 'Comedy'       | -0.0759 
  19  | 'Election...   | 'Comedy'       | -0.0732 
  20  | 'Annie Ha...   | 'Comedy|R...   | -0.0717 

Component 2 
highLowMovies = 
        1              | 2              | 3 
   1  | 'Star War...   | 'Action|A...   |  0.1858 
   2  | 'Back to...    | 'Comedy|S...   |  0.1746 
   3  | 'E.T. the...   | 'Children...   |  0.1657 
   4  | 'Terminat...   | 'Action|S...   |  0.1602 
   5  | 'Star War...   | 'Action|A...   |  0.1563 
   6  | 'Star War...   | 'Action|A...   |  0.1509 
   7  | 'Ghostbus...   | 'Comedy|H...   |  0.1483 
   8  | 'Terminat...   | 'Action|S...   |  0.1350 
   9  | 'Jurassic...   | 'Action|A...   |  0.1344 
  10  | 'Jaws (1975)'  | 'Action|H...   |  0.1332 
  11  | 'Batman &...   | 'Action|A...   | -0.0783 
  12  | 'Wild Wil...   | 'Action|S...   | -0.0762 
  13  | 'Armagedd...   | 'Action|A...   | -0.0756 
  14  | 'Judge Dr...   | 'Action|A...   | -0.0706 
  15  | 'Saint, T...   | 'Action|R...   | -0.0699 
  16  | 'Entrapme...   | 'Crime|Th...   | -0.0683 
  17  | 'Con Air...    | 'Action|A...   | -0.0639 
  18  | 'Double J...   | 'Action|T...   | -0.0638 
  19  | 'Congo (1...   | 'Action|A...   | -0.0632 
  20  | 'Gone in...    | 'Action|C...   | -0.0617 

Component 3 
highLowMovies = 
        1              | 2              | 3 
   1  | 'Titanic ...   | 'Drama|Ro...   |  0.1369 
   2  | 'Ghost (1...   | 'Comedy|R...   |  0.1205 
   3  | 'Pretty W...   | 'Comedy|R...   |  0.1203 
   4  | 'When Har...   | 'Comedy|R...   |  0.1144 
   5  | 'Sleeples...   | 'Comedy|R...   |  0.1141 
   6  | 'Little M...   | 'Animatio...   |  0.1125 
   7  | 'Beauty a...   | 'Animatio...   |  0.1089 
   8  | 'Aladdin...    | 'Animatio...   |  0.1029 
   9  | 'Mary Pop...   | 'Children...   |  0.1014 
  10  | 'Sound of...   | 'Musical'      |  0.1002 
  11  | 'Starship...   | 'Action|A...   | -0.1290 
  12  | 'Matrix, ...   | 'Action|S...   | -0.1217 
  13  | 'Mars Att...   | 'Action|C...   | -0.1109 
  14  | 'Pulp Fic...   | 'Crime|Drama'  | -0.1047 
  15  | 'Fifth El...   | 'Action|S...   | -0.1033 
  16  | 'Austin P...   | 'Comedy'       | -0.0924 
  17  | 'From Dus...   | 'Action|C...   | -0.0918 
  18  | 'South Pa...   | 'Animatio...   | -0.0899 
  19  | 'Star War...   | 'Action|A...   | -0.0893 
  20  | 'Aliens (...   | 'Action|S...   | -0.0881 

Component 4 
highLowMovies = 
        1              | 2              | 3 
   1  | 'There's...    | 'Comedy'       |  0.1686 
   2  | 'American...   | 'Comedy|D...   |  0.1611 
   3  | 'Forrest...    | 'Comedy|R...   |  0.1581 
   4  | 'Bravehea...   | 'Action|D...   |  0.1371 
   5  | 'Pulp Fic...   | 'Crime|Drama'  |  0.1258 
   6  | 'Saving P...   | 'Action|D...   |  0.1245 
   7  | 'Good Wil...   | 'Drama'        |  0.1168 
   8  | 'Matrix, ...   | 'Action|S...   |  0.1152 
   9  | 'American...   | 'Comedy'       |  0.1149 
  10  | 'Gladiato...   | 'Action|D...   |  0.1125 
  11  | 'Star Tre...   | 'Action|A...   | -0.1163 
  12  | 'Star Tre...   | 'Action|A...   | -0.1162 
  13  | 'Rocky Ho...   | 'Comedy|H...   | -0.1120 
  14  | 'Star Tre...   | 'Action|A...   | -0.1072 
  15  | 'Star Tre...   | 'Action|A...   | -0.0969 
  16  | 'Wizard o...   | 'Adventur...   | -0.0918 
  17  | 'Star Tre...   | 'Action|A...   | -0.0903 
  18  | 'Star Tre...   | 'Action|A...   | -0.0898 
  19  | 'Star Tre...   | 'Action|S...   | -0.0889 
  20  | 'Star Tre...   | 'Action|A...   | -0.0829 





There's clearly a lot more you could do to look at this, but let's forge ahead. (Note: the next thing we'd probably 
do is something that replicates the analysis of age below but for movie genre). 


Examining Users 


Now that we've examined what the principal components tell us about the movies, we will try to understand 
what the principal components tell us about the users. To do this, we will project each user into the principal 
components space and examine those values (these are the alphas that you saw in the homework). 


Since all we know about these users is a few basic pieces of demographic information, we're going to see if any 
of these alpha dimensions correspond to these demographics. The first one we are going to look at is gender. 


Sorry about the demographic data being in terms of gender binary. We acknowledge that this is not 
representative of reality, but we are engaging with it in an attempt to understand the data. We should 
always be conscious when drawing inferences to consider the fact that the encoding of gender was 
done in this way. 


figure; 
isFemale = cellfun(@(x) strcmp('F',x), {users{:,2}}); 
nbins = 20; 
alphas = ratingsMean0*V; 
for i = 1:size(alphas, 2) 
    subplot(2, 2, i); 
    histogram(alphas(isFemale, i), nbins, 'Normalization', 'probability'); 
    hold on; 
    % the ~ just means not (so ~isFemale means male) 
    histogram(alphas(~isFemale, i), nbins, 'Normalization', 'probability'); 
    title(['Component ', num2str(i)]); 
    legend({'female', 'male'}, 'location', 'best'); 
end 




















[Figure: 2-by-2 grid of histograms, one panel per component (Component 1 through Component 4), comparing the distribution of alpha values for female and male users; y-axis: probability] 









































Next we'll take a look at age and see if any of these components seem to correlate with that. 


fig = figure; 
fig.Units = 'centimeters'; 
% change figure size 
fig.Position(3:4) = [45 100]; 
ageBrackets = {'Under 18', '18-24', '25-34', '35-44', '45-49', '50-55', '56+'}; 
for i = 1:size(alphas, 2) 
    subplot(4, 1, i); 
    for j = 1:length(ageBrackets) 
        isInAge = cellfun(@(x) strcmp(ageBrackets{j},x), {users{:,4}}); 
        histogram(alphas(isInAge, i), nbins, 'Normalization', 'probability'); 
        hold on; 
    end 
    title(['Component ', num2str(i)]); 
    legend(ageBrackets, 'location', 'best'); 
end 


[Figure: four stacked histogram panels (Component 1 through Component 4) showing the distribution of alpha values for each age bracket (Under 18, 18-24, 25-34, 35-44, 45-49, 50-55, 56+); y-axis: probability] 











As a final step, we're going to look at users who have either very high or very low values for each dimension of
alpha. You can think of this as analogous to what we did when we looked at movies that had high or low values
for each of the eigenvectors in V.

for i = 1:size(alphas, 2)
    [~, highestUserIndex] = max(alphas(:,i));
    [~, lowestUserIndex] = min(alphas(:,i));
    disp(' ');
    disp(['Component ', num2str(i)]);
    disp('User with the largest component');
    users{highestUserIndex, :}
    disp('This user rated the following movies as high and low');
    getHighAndLowUserRatings(highestUserIndex, movies, ratings)
    disp('User with the smallest (probably negative) component');
    users{lowestUserIndex, :}
    disp('This user rated the following movies as high and low');
    getHighAndLowUserRatings(lowestUserIndex, movies, ratings)
end


Component 1
User with the largest component
ans = 1737
ans = 'M'
ans = '46614'
ans = '35-44'
ans = 'writer'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Ferris B...    5
  2   'Spacebal...    5
  3   'Backdraf...    5
  4   'Toy Stor...    5
  5   'Green Mi...    5
  6   'Galaxy Q...    5
  7   'Frequenc...    5
  8   'Predator...    5
  9   'Running ...    5
 10   'Almost F...    5
 11   'English ...    1
 12   'Blair Wi...    1
 13   'Eyes Wid...    1
 14   'Man on t...    1
 15   'Get Shor...    2
 16   'Legends ...    2
 17   'Nightmar...    2
 18   'Dances w...    2
 19   'Star Tre...    2
 20   'L.A. Con...    2

User with the smallest (probably negative) component
ans = 5795
ans = 'M'
ans = '92688'
ans = '25-34'
ans = 'academic/educator'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Fast Tim...    5
  2   'American...    5
  3   'JFK (1...      5
  4   'Muppet M...    5
  5   'Animal H...    5
  6   'Double I...    5
  7   'Close En...    5
  8   'Misery (...    5
  9   'Network ...    5
 10   'Diner (1...    5
 11   'Broken A...    1
 12   'Happy Gi...    1
 13   'Rumble i...    1
 14   'Congo (1...    1
 15   'Desperad...    1
 16   'Judge Dr...    1
 17   'Net, The...    1
 18   'Waterwor...    1
 19   'Outbreak...    1
 20   'While Yo...    1

















Component 2
User with the largest component
ans = 1340
ans = 'M'
ans = '14302'
ans = '25-34'
ans = 'executive/managerial'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Airplane...    5
  2   'American...    5
  3   'Total Re...    5
  4   'Robocop ...    5
  5   'Trading ...    5
  6   'Fatal At...    5
  7   'Wayne's ...    5
  8   'Thelma &...    5
  9   'Close En...    5
 10   'Naked Gu...    5
 11   'GoldenEy...    1
 12   'Leaving ...    1
 13   'Dead Man...    1
 14   'Mr. Holl...    1
 15   'Broken A...    1
 16   'Batman F...    1
 17   'Congo (1...    1
 18   'Die Hard...    1
 19   'Judge Dr...    1
 20   'Net, The...    1

User with the smallest (probably negative) component
ans = 2807
ans = 'F'
ans = '22043'
ans = '35-44'
ans = 'lawyer'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Arsenic ...    5
  2   'Young Fr...    5
  3   'Truman S...    5
  4   'As Good ...    5
  5   'Bulworth...    5
  6   'Seven Sa...    5
  7   'Roger & ...    5
  8   'Player, ...    5
  9   'Producer...    5
 10   'Being Jo...    5
 11   'Jurassic...    1
 12   'Terminat...    1
 13   'Independ...    1
 14   'Ransom (...    1
 15   'Raiders ...    1
 16   'Aliens (...    1
 17   'Alien (1...    1
 18   'Terminat...    1
 19   'Back to ...    1
 20   'Splash (...    1

















Component 3
User with the largest component
ans = 2073
ans = 'F'
ans = '13148'
ans = '18-24'
ans = 'college/grad student'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Magnolia...    5
  2   'Breaking...    5
  3   'American...    5
  4   'Muppet M...    5
  5   'Erin Bro...    5
  6   'Good Mor...    5
  7   'High Fid...    5
  8   'What Abo...    5
  9   'Almost F...    5
 10   'Best in ...    5
 11   'Desperad...    1
 12   'Mask, Th...    1
 13   'Fugitive...    1
 14   'In the L...    1
 15   'Tombston...    1
 16   'Night of...    1
 17   'Aliens (...    1
 18   'Good, Th...    1
 19   'Nikita (...    1
 20   'Unforgiv...    1

User with the smallest (probably negative) component
ans = 2015
ans = 'M'
ans = '01003'
ans = '18-24'
ans = 'college/grad student'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Shanghai...    5
  2   'Moonrake...    5
  3   'Blazing ...    5
  4   'Running ...    5
  5   'Mad Max ...    5
  6   'Big Trou...    5
  7   'What Abo...    5
  8   'Naked Gu...    5
  9   'Best in ...    5
 10   'Meet the...    5
 11   'Toy Stor...    1
 12   'Species ...    1
 13   'Willy Wo...    1
 14   'English ...    1
 15   'Grease (...    1
 16   'My Best...     1
 17   'Ice Stor...    1
 18   'Breakfas...    1
 19   'Splash (...    1
 20   'Babe: Pi...    1

















Component 4
User with the largest component
ans = 3610
ans = 'M'
ans = '30064'
ans = '18-24'
ans = 'doctor/health care'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Entrapme...    5
  2   'Mummy, T...    5
  3   'Big Dadd...    5
  4   'Sixth Se...    5
  5   '13th War...    5
  6   'World Is...    5
  7   'End of D...    5
  8   'Gladiato...    5
  9   'Mission...     5
 10   'Meet the...    5
 11   'Babe (1995)'   1
 12   'Birdcage...    1
 13   'Star Tre...    1
 14   'Philadel...    1
 15   'Singin' ...    1
 16   'Vertigo ...    1
 17   'Rear Win...    1
 18   'North by...    1
 19   'Some Lik...    1
 20   'Casablan...    1

User with the smallest (probably negative) component
ans = 1150
ans = 'F'
ans = '75226'
ans = '25-34'
ans = 'writer'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Godfathe...    5
  2   'Annie Ha...    5
  3   'Duck Sou...    5
  4   'Big Lebo...    5
  5   'Roger & ...    5
  6   'Jungle B...    5
  7   'Lady and...    5
  8   'Hard Day...    5
  9   'Being Jo...    5
 10   'Blazing ...    5
 11   'Babe (1995)'   1
 12   'Clueless...    1
 13   'Mr. Holl...    1
 14   'Congo (1...    1
 15   'Desperad...    1
 16   'Intervie...    1
 17   'Legends ...    1
 18   'Forrest ...    1
 19   'Hot Shot...    1
 20   'Much Ado...    1


PCA on Movies


Now that we've done a PCA in which users were observations and movies were dimensions, we might
wonder what we would have found had we done the analysis with movies as observations and users as
dimensions. Maybe we'd find out something new and cool!


Luckily it's pretty easy to try this. Instead of computing our movies by movies matrix as we did above, we will
compute a users by users matrix. Since our original data has mean 0 across both rows and columns we don't
even have to worry about removing the mean. All we need to do is use the transpose of our data matrix when
doing the calculations we did above.














usersByUsers = ratingsMean0*ratingsMean0';
[V, D] = eigs(usersByUsers, 4);
size(V)
ans =
        2909           4
figure;
plot(diag(D));
xlabel('Principal Component Number');
ylabel('Eigenvalue');

















[Figure: the four largest eigenvalues of usersByUsers plotted against Principal Component Number (1 to 4); the Eigenvalue axis runs from 4000 to 12000.]
Looks a little bit familiar, but let's hold that thought :). 
Understanding the Principal Components

Just as before, let's try to understand what the principal components are telling us.


Examining Users


We'll repeat the analysis of users that we did before.


First, let's look at the gender binary. 


figure;
nbins = 20;
for i = 1:size(V, 2)
    subplot(2,2,i);
    histogram(V(isFemale, i), nbins, 'Normalization', 'probability');
    hold on;
    histogram(V(~isFemale, i), nbins, 'Normalization', 'probability');
    title(['Component ', num2str(i)]);
    legend({'female', 'male'}, 'location', 'best');
end
[Figure: histograms of the entries of V for Components 1 to 4, one panel per component, comparing female and male users; the horizontal axis runs from about -0.1 to 0.1 and the vertical axis is probability.]


Next we'll take a look at age and see if any of these components seem to correlate with that. 


fig = figure;
fig.Units = 'centimeters';
% change figure size
fig.Position(3:4) = [45 100];

ageBrackets = {'Under 18', '18-24', '25-34', '35-44', '45-49', '50-55', '56+'};
for i = 1:size(V, 2)
    subplot(4,1,i);
    for j = 1:length(ageBrackets)
        isInAge = cellfun(@(x) strcmp(ageBrackets{j},x), {users{:,4}});
        histogram(V(isInAge, i), nbins, 'Normalization', 'probability');
        hold on;
    end
    title(['Component ', num2str(i)]);
    legend(ageBrackets, 'location', 'best');
end
[Figure: histograms of the entries of V for Components 1 to 4, one panel per component, with one histogram per age bracket (Under 18, 18-24, 25-34, 35-44, 45-49, 50-55, 56+); the horizontal axis runs from about -0.1 to 0.1 and the vertical axis is probability.]








As a final step to look at users, we're going to look at users that have either very high or very low values for 
each principal component. 











for i = 1:size(V, 2)
    [~, highestUserIndex] = max(V(:,i));
    [~, lowestUserIndex] = min(V(:,i));
    disp(' ');
    disp(['Component ', num2str(i)]);
    disp('User with the largest component');
    users{highestUserIndex, :}
    disp('This user rated the following movies as high and low');
    getHighAndLowUserRatings(highestUserIndex, movies, ratings)
    disp('User with the smallest (probably negative) component');
    users{lowestUserIndex, :}
    disp('This user rated the following movies as high and low');
    getHighAndLowUserRatings(lowestUserIndex, movies, ratings)
end


Component 1
User with the largest component
ans = 5795
ans = 'M'
ans = '92688'
ans = '25-34'
ans = 'academic/educator'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Fast Tim...    5
  2   'American...    5
  3   'JFK (1...      5
  4   'Muppet M...    5
  5   'Animal H...    5
  6   'Double I...    5
  7   'Close En...    5
  8   'Misery (...    5
  9   'Network ...    5
 10   'Diner (1...    5
 11   'Broken A...    1
 12   'Happy Gi...    1
 13   'Rumble i...    1
 14   'Congo (1...    1
 15   'Desperad...    1
 16   'Judge Dr...    1
 17   'Net, The...    1
 18   'Waterwor...    1
 19   'Outbreak...    1
 20   'While Yo...    1

User with the smallest (probably negative) component
ans = 1737
ans = 'M'
ans = '46614'
ans = '35-44'
ans = 'writer'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Ferris B...    5
  2   'Spacebal...    5
  3   'Backdraf...    5
  4   'Toy Stor...    5
  5   'Green Mi...    5
  6   'Galaxy Q...    5
  7   'Frequenc...    5
  8   'Predator...    5
  9   'Running ...    5
 10   'Almost F...    5
 11   'English ...    1
 12   'Blair Wi...    1
 13   'Eyes Wid...    1
 14   'Man on t...    1
 15   'Get Shor...    2
 16   'Legends ...    2
 17   'Nightmar...    2
 18   'Dances w...    2
 19   'Star Tre...    2
 20   'L.A. Con...    2











Component 2
User with the largest component
ans = 2807
ans = 'F'
ans = '22043'
ans = '35-44'
ans = 'lawyer'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Arsenic ...    5
  2   'Young Fr...    5
  3   'Truman S...    5
  4   'As Good ...    5
  5   'Bulworth...    5
  6   'Seven Sa...    5
  7   'Roger & ...    5
  8   'Player, ...    5
  9   'Producer...    5
 10   'Being Jo...    5
 11   'Jurassic...    1
 12   'Terminat...    1
 13   'Independ...    1
 14   'Ransom (...    1
 15   'Raiders ...    1
 16   'Aliens (...    1
 17   'Alien (1...    1
 18   'Terminat...    1
 19   'Back to ...    1
 20   'Splash (...    1

User with the smallest (probably negative) component
ans = 1340
ans = 'M'
ans = '14302'
ans = '25-34'
ans = 'executive/managerial'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Airplane...    5
  2   'American...    5
  3   'Total Re...    5
  4   'Robocop ...    5
  5   'Trading ...    5
  6   'Fatal At...    5
  7   'Wayne's ...    5
  8   'Thelma &...    5
  9   'Close En...    5
 10   'Naked Gu...    5
 11   'GoldenEy...    1
 12   'Leaving ...    1
 13   'Dead Man...    1
 14   'Mr. Holl...    1
 15   'Broken A...    1
 16   'Batman F...    1
 17   'Congo (1...    1
 18   'Die Hard...    1
 19   'Judge Dr...    1
 20   'Net, The...    1

















Component 3
User with the largest component
ans = 2073
ans = 'F'
ans = '13148'
ans = '18-24'
ans = 'college/grad student'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Magnolia...    5
  2   'Breaking...    5
  3   'American...    5
  4   'Muppet M...    5
  5   'Erin Bro...    5
  6   'Good Mor...    5
  7   'High Fid...    5
  8   'What Abo...    5
  9   'Almost F...    5
 10   'Best in ...    5
 11   'Desperad...    1
 12   'Mask, Th...    1
 13   'Fugitive...    1
 14   'In the L...    1
 15   'Tombston...    1
 16   'Night of...    1
 17   'Aliens (...    1
 18   'Good, Th...    1
 19   'Nikita (...    1
 20   'Unforgiv...    1

User with the smallest (probably negative) component
ans = 2015
ans = 'M'
ans = '01003'
ans = '18-24'
ans = 'college/grad student'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Shanghai...    5
  2   'Moonrake...    5
  3   'Blazing ...    5
  4   'Running ...    5
  5   'Mad Max ...    5
  6   'Big Trou...    5
  7   'What Abo...    5
  8   'Naked Gu...    5
  9   'Best in ...    5
 10   'Meet the...    5
 11   'Toy Stor...    1
 12   'Species ...    1
 13   'Willy Wo...    1
 14   'English ...    1
 15   'Grease (...    1
 16   'My Best...     1
 17   'Ice Stor...    1
 18   'Breakfas...    1
 19   'Splash (...    1
 20   'Babe: Pi...    1

















Component 4
User with the largest component
ans = 1150
ans = 'F'
ans = '75226'
ans = '25-34'
ans = 'writer'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Godfathe...    5
  2   'Annie Ha...    5
  3   'Duck Sou...    5
  4   'Big Lebo...    5
  5   'Roger & ...    5
  6   'Jungle B...    5
  7   'Lady and...    5
  8   'Hard Day...    5
  9   'Being Jo...    5
 10   'Blazing ...    5
 11   'Babe (1995)'   1
 12   'Clueless...    1
 13   'Mr. Holl...    1
 14   'Congo (1...    1
 15   'Desperad...    1
 16   'Intervie...    1
 17   'Legends ...    1
 18   'Forrest ...    1
 19   'Hot Shot...    1
 20   'Much Ado...    1

User with the smallest (probably negative) component
ans = 3610
ans = 'M'
ans = '30064'
ans = '18-24'
ans = 'doctor/health care'
This user rated the following movies as high and low
ans = 20x2 cell
      1               2
  1   'Entrapme...    5
  2   'Mummy, T...    5
  3   'Big Dadd...    5
  4   'Sixth Se...    5
  5   '13th War...    5
  6   'World Is...    5
  7   'End of D...    5
  8   'Gladiato...    5
  9   'Mission...     5
 10   'Meet the...    5
 11   'Babe (1995)'   1
 12   'Birdcage...    1
 13   'Star Tre...    1
 14   'Philadel...    1
 15   'Singin' ...    1
 16   'Vertigo ...    1
 17   'Rear Win...    1
 18   'North by...    1
 19   'Some Lik...    1
 20   'Casablan...    1


Examining Movies


In order to examine the movies, we will project the movies onto the principal components (columns of V) and
then see which movies have particularly high or low values for these projections (we call these the alpha values).


alphas = ratingsMean0'*V;

for i = 1:size(alphas, 2)
    disp(['Component ', num2str(i)]);
    getHighAndLowMovies(alphas(:,i), movies)
end
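
To make the projection step concrete, here is a minimal sketch on a made-up 4-by-3 ratings matrix. The matrix X, the variable names, and the use of eig in place of eigs are assumptions for illustration only; they are not part of the analysis above. The sketch projects the movies onto the two leading users-by-users eigenvectors and compares the length of each column of projections with the square root of the corresponding eigenvalue.

```matlab
% Toy data: 4 users (rows) by 3 movies (columns). The values are made up
% and have mean 0 across both rows and columns, like ratingsMean0.
X = [ 2 -1 -1;
     -1  2 -1;
      1 -2  1;
     -2  1  1];

% Eigen-decomposition of the users-by-users matrix (eig instead of eigs,
% because the toy matrix is tiny).
[V, D] = eig(X*X');
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order(1:2));                 % keep the two leading components

% Project the movies onto the principal components.
alphas = X' * V;                      % 3 movies by 2 components

% Length of each column of projections versus sqrt(eigenvalue).
colNorms = sqrt(sum(alphas.^2, 1));
disp([colNorms; sqrt(lambda(1:2))']);
```

The two rows printed at the end should agree, since the squared length of X'*v equals v'*(X*X')*v, which is the eigenvalue when v is a unit eigenvector.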


Component 1
ans = 20x3 cell
      1               2                       3
  1   'Fargo (1...    'Crime|Dr...      11.4741
  2   'Being Jo...    'Comedy'          10.5606
  3   'Rushmore...    'Comedy'           9.9952
  4   'Clockwor...    'Sci-Fi'           9.5019
  5   'American...    'Comedy|D...       9.2726
  6   'Pulp Fic...    'Crime|Drama'      8.5817
  7   '2001: A...     'Drama|My...       8.4443
  8   'Raising...     'Comedy'           8.2110
  9   'Election...    'Comedy'           7.9125
 10   'Annie Ha...    'Comedy|R...       7.7508
 11   'Independ...    'Action|S...     -21.2742
 12   'Armagedd...    'Action|A...     -16.0816
 13   'Jurassic...    'Action|A...     -15.0277
 14   'Star War...    'Action|A...     -14.5359
 15   'Twister...     'Action|A...     -13.0293
 16   'Lost Wor...    'Action|A...     -12.2753
 17   'Men in B...    'Action|A...     -11.6581
 18   'Rock, Th...    'Action|A...     -11.6275
 19   'Speed (1...    'Action|R...     -11.2455
 20   'Forrest...     'Comedy|R...     -10.9546

Component 2
ans = 20x3 cell
      1               2                       3
  1   'Batman &...    'Action|A...       6.3047
  2   'Wild Wil...    'Action|S...       6.1325
  3   'Armagedd...    'Action|A...       6.0882
  4   'Judge Dr...    'Action|A...       5.6803
  5   'Saint, T...    'Action|R...       5.6252
  6   'Entrapme...    'Crime|Th...       5.4999
  7   'Con Air...     'Action|A...       5.1424
  8   'Double J...    'Action|T...       5.1340
  9   'Congo (1...    'Action|A...       5.0835
 10   'Gone in...     'Action|C...       4.9682
 11   'Star War...    'Action|A...     -14.9568
 12   'Back to...     'Comedy|S...     -14.0561
 13   'E.T. the...    'Children...     -13.3320
 14   'Terminat...    'Action|S...     -12.8908
 15   'Star War...    'Action|A...     -12.5778
 16   'Star War...    'Action|A...     -12.1434
 17   'Ghostbus...    'Comedy|H...     -11.9374
 18   'Terminat...    'Action|S...     -10.8656
 19   'Jurassic...    'Action|A...     -10.8168
 20   'Jaws (1975)'   'Action|H...     -10.7220

Component 3
ans = 20x3 cell
      1               2                       3
  1   'Titanic ...    'Drama|Ro...      10.5506
  2   'Ghost (1...    'Comedy|R...       9.2843
  3   'Pretty W...    'Comedy|R...       9.2699
  4   'When Har...    'Comedy|R...       8.8200
  5   'Sleeples...    'Comedy|R...       8.7931
  6   'Little M...    'Animatio...       8.6716
  7   'Beauty a...    'Animatio...       8.3946
  8   'Aladdin...     'Animatio...       7.9341
  9   'Mary Pop...    'Children...       7.8153
 10   'Sound of...    'Musical'      [illegible]
 11   'Starship...    'Action|A...      -9.9429
 12   'Matrix, ...    'Action|S...      -9.3790
 13   'Mars Att...    'Action|C...      -8.5441
 14   'Pulp Fic...    'Crime|Drama'     -8.0675
 15   'Fifth El...    'Action|S...      -7.9597
 16   'Austin P...    'Comedy'          -7.1214
 17   'From Dus...    'Action|C...      -7.0763
 18   'South Pa...    'Animatio...      -6.9271
 19   'Star War...    'Action|A...      -6.8836
 20   'Aliens (...    'Action|S...      -6.7933

Component 4
ans = 20x3 cell
      1               2                       3
  1   'Star Tre...    'Action|A...       8.1841
  2   'Star Tre...    'Action|A...       8.1733
  3   'Rocky Ho...    'Comedy|H...       7.8815
  4   'Star Tre...    'Action|A...       7.5446
  5   'Star Tre...    'Action|A...       6.8142
  6   'Wizard o...    'Adventur...       6.4594
  7   'Star Tre...    'Action|A...       6.3536
  8   'Star Tre...    'Action|A...       6.3154
  9   'Star Tre...    'Action|S...       6.2520
 10   'Star Tre...    'Action|A...       5.8326
 11   'There's...     'Comedy'         -11.8633
 12   'American...    'Comedy|D...     -11.3319
 13   'Forrest ...    'Comedy|R...     -11.1231
 14   'Bravehea...    'Action|D...      -9.6422
 15   'Pulp Fic...    'Crime|Drama'     -8.8487
 16   'Saving P...    'Action|D...      -8.7559
 17   'Good Wil...    'Drama'           -8.2164
 18   'Matrix, ...    'Action|S...      -8.1028
 19   'American...    'Comedy'          -8.0819
 20   'Gladiato...    'Action|D...      -7.9168








Some Prompts for Discussion and Next Steps 


1. What did you learn from the analysis? What are the limitations? 

2. What commonalities did you see when we did PCA on users versus on movies? 

3. What else might you do to understand this data (e.g., build on what's here or do something completely 
different)? 


function [movies, users, ratings] = filterOutRarities(movies, users, ratings, numMovieRatingsCutoff, numUserRatingsCutoff)
% filterOutRarities Remove movies that don't have more than the specified number of
% ratings. Next, remove users that don't have more than the specified number of
% ratings.
movieMask = sum(~isnan(ratings)) > numMovieRatingsCutoff;
userMask = sum(~isnan(ratings),2) > numUserRatingsCutoff;
ratings = ratings(userMask, movieMask);
movies = movies(movieMask,:);
users = users(userMask,:);
end








function movieExtremes = getHighAndLowMovies(v, movies)
% return a cell array with the most positive and most negative
% components of the right singular vector v.
nHighLow = 10;
movieExtremes = cell(nHighLow*2, 3);
[c, indices] = sort(v);
movieExtremes(1:nHighLow,1) = flip(movies(indices(end-(nHighLow-1):end),2));
movieExtremes(1:nHighLow,2) = flip(movies(indices(end-(nHighLow-1):end),3));
movieExtremes(1:nHighLow,3) = num2cell(flip(c(end-(nHighLow-1):end)));
movieExtremes(1+nHighLow:end,1) = movies(indices(1:nHighLow),2);
movieExtremes(1+nHighLow:end,2) = movies(indices(1:nHighLow),3);
movieExtremes(1+nHighLow:end,3) = num2cell(c(1:nHighLow));
end

function userRatings = getHighAndLowUserRatings(userIndex, movies, ratings)
% return a cell array with the most positive and most negative reviews
% given by the specified user
nHighLow = 10;
userRatings = cell(nHighLow*2,2);
[r, indices] = sort(ratings(userIndex,:));
% filter out NaNs
indices = indices(~isnan(r));
r = r(~isnan(r));
userRatings(1:nHighLow,1) = movies(indices(end-(nHighLow-1):end),2);
userRatings(1:nHighLow,2) = num2cell(r(end-(nHighLow-1):end));
userRatings(1+nHighLow:end,1) = movies(indices(1:nHighLow),2);
userRatings(1+nHighLow:end,2) = num2cell(r(1:nHighLow));
end










Chapter 22 


Week 8b: Singular Value 
Decomposition (SVD) 


Schedule 





22.1 SVD - The Big Idea [40 minutes]
22.2 SVD - Example [20 minutes]
22.3 SVD and User-Movie Data [10 mins]
22.4 Preview of the Homework and Project [10 minutes]





We previously met the Eigenvalue Decomposition (EVD), which we used on square matrices. There 
is no EVD for rectangular matrices, but there does exist a generalization known as the Singular Value 
Decomposition, which is one of the most useful matrix decompositions in applied linear algebra. In fact, we 
met the basic ingredients of the SVD in the previous class when we explored the user-movie rating data 
matrix. See the following webpage at the American Mathematical Society for a good geometric discussion of
the SVD.


22.1 SVD - The Big Idea [40 minutes] 


Rectangular matrices don’t have eigenvalues and eigenvectors. However, they have a generalisation of these
known as singular values and singular vectors.

• The singular values $\sigma_i$ and singular vectors $u_i$, $v_i$ of an $n \times m$ rectangular matrix $A$ satisfy the
definition

$$A v_i = \sigma_i u_i \tag{22.1}$$

$$A^T u_i = \sigma_i v_i \tag{22.2}$$

The singular vectors $v_i$ are known as the right singular vectors and the singular vectors $u_i$ are
known as the left singular vectors.

• There are $r = \min(n, m)$ singular values, and all of them are non-zero when $A$ has full rank. The singular
vectors $v_i$ are the eigenvectors of $A^T A$, and the singular vectors $u_i$ are the eigenvectors of $A A^T$. The
non-zero eigenvalues of $A^T A$ and $A A^T$ are $\sigma_i^2$.

• The $n \times m$ matrix $A$ has a singular value decomposition (SVD) of the form

$$A = U \Sigma V^T \tag{22.3}$$

where $U$ is an $n \times r$ matrix whose orthonormal columns are the $u_i$, $\Sigma$ is an $r \times r$ diagonal matrix whose
diagonal entries are the $\sigma_i$, and $V$ is an $m \times r$ matrix whose orthonormal columns are the $v_i$. Please note
that this version of the SVD is called the reduced or economy SVD - there is a more general form but this is the
most useful in a practical setting.
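
As a numerical sanity check of the relations above, the following minimal sketch builds a random test matrix and confirms that the output of MATLAB's svd with the 'econ' option satisfies Equations (22.1) and (22.2), and that the eigenvalues of A'*A match the squared singular values. The matrix sizes, the seed, and the variable names are assumptions chosen for illustration; a check like this is evidence, not a proof.

```matlab
% Numerical check of Equations (22.1) and (22.2) on random test data.
rng(0);                          % fixed seed so the run is repeatable
n = 5; m = 3;                    % hypothetical sizes
A = randn(n, m);

[U, S, V] = svd(A, 'econ');      % here U is 5-by-3, S is 3-by-3, V is 3-by-3
sigma = diag(S);

maxRes = 0;
for i = 1:numel(sigma)
    res1 = norm(A*V(:,i)  - sigma(i)*U(:,i));   % A*v_i  - sigma_i*u_i
    res2 = norm(A'*U(:,i) - sigma(i)*V(:,i));   % A'*u_i - sigma_i*v_i
    maxRes = max([maxRes, res1, res2]);
end
fprintf('largest residual: %.2e\n', maxRes);

% The eigenvalues of A'*A should match the squared singular values.
G = A'*A;
G = (G + G')/2;                  % remove any round-off asymmetry
lambda = sort(eig(G), 'descend');
eigGap = norm(lambda - sigma.^2);
fprintf('eigenvalue mismatch: %.2e\n', eigGap);
```

Both printed numbers should be close to machine precision.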


Exercise 22.1

1. Read “The Big Idea” again!

2. Let’s assume that $A$ is a $3 \times 2$ matrix. What is the size of $A^T$? What is the size of $v_i$ and $u_i$?
What is the size of $A^T A$ and $A A^T$? How many eigenvalues will $A^T A$ have? How many
eigenvalues will $A A^T$ have? What must be true about these eigenvalues according to “The
Big Idea”?

3. Show that $\sigma_i^2$ and $v_i$ are the eigenvalues and eigenvectors of $A^T A$ by multiplying Equation
(22.1) by $A^T$ and then using Equation (22.2) to simplify.

4. Show that $\sigma_i^2$ and $u_i$ are the eigenvalues and eigenvectors of $A A^T$ by multiplying Equation
(22.2) by $A$ and then using Equation (22.1) to simplify.

5. Take the transpose of Equation (22.2) and justify the use of the term left singular vector for
$u_i$.

6. Why is it valid to write

$$A \, [v_1 \ldots v_r] = [u_1 \ldots u_r] \, \Sigma$$

and why does this imply Equation (22.3)?

7. Why does Equation (22.3) imply that

$$A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \ldots + \sigma_r u_r v_r^T. \tag{22.4}$$

8. An $n \times m$ matrix has $nm$ data values. How many data values do you need to store $\sigma_1$, $u_1$, and
$v_1$? What kind of compression ratio would you have if you only stored the first singular value
and the first singular vectors?
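
Equation (22.4) from the exercise above can also be explored numerically. The sketch below is a minimal illustration under stated assumptions: the 4-by-3 test matrix (three columns of magic(4)) and the variable names are hypothetical choices, not part of the course materials. It adds the rank-one terms one at a time and records how far the partial sum is from the original matrix.

```matlab
% Build a matrix back up from its rank-one SVD terms, Equation (22.4).
A = magic(4);
A = A(:, 1:3);                        % hypothetical 4-by-3 test matrix
[U, S, V] = svd(A, 'econ');
sigma = diag(S);

Asum = zeros(size(A));                % running sum of rank-one terms
relErr = zeros(numel(sigma), 1);      % relative error after k terms
for k = 1:numel(sigma)
    Asum = Asum + sigma(k) * U(:,k) * V(:,k)';
    relErr(k) = norm(A - Asum, 'fro') / norm(A, 'fro');
end
disp(relErr');
```

The recorded errors should shrink as terms are added and drop to round-off level once all of the terms are included.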





22.2 SVD - Example [20 minutes] 


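Consider the rectangular matrix

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \\ 7 & 8 \end{bmatrix}$$

Since this matrix is $4 \times 2$ we will form the $2 \times 2$ matrix $A^T A$,

$$A^T A = \begin{bmatrix} 84 & 100 \\ 100 & 120 \end{bmatrix}$$

The eigenvalues of $A^T A$ are 203.6071 and 0.3929 respectively. The singular values are the square roots of
these, namely $\sigma_1 = 14.2691$ and $\sigma_2 = 0.6268$ respectively. The associated eigenvectors are

$$v_1 = \begin{bmatrix} 0.6414 \\ 0.7672 \end{bmatrix}$$

and

$$v_2 = \begin{bmatrix} -0.7672 \\ 0.6414 \end{bmatrix}$$

so that the matrix $\Sigma$ is

$$\Sigma = \begin{bmatrix} 14.2691 & 0 \\ 0 & 0.6268 \end{bmatrix}$$

and the matrix $V$ is

$$V = \begin{bmatrix} 0.6414 & -0.7672 \\ 0.7672 & 0.6414 \end{bmatrix}$$

Please note that each of the columns of $V$ could be multiplied by $-1$ - the eigenvectors are only unique up
to their direction (and the opposite direction).
To determine the $U$ matrix, we recall that

$$A v_i = \sigma_i u_i$$

which we can re-arrange and solve for $u_i$

$$u_i = \frac{1}{\sigma_i} A v_i$$

In this case $u_1$ is given by

$$u_1 = \frac{1}{14.2691} \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \\ 7 & 8 \end{bmatrix} \begin{bmatrix} 0.6414 \\ 0.7672 \end{bmatrix} = \begin{bmatrix} 0.1525 \\ 0.3499 \\ 0.5474 \\ 0.7448 \end{bmatrix}$$

and $u_2$ is given by

$$u_2 = \frac{1}{0.6268} \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \\ 7 & 8 \end{bmatrix} \begin{bmatrix} -0.7672 \\ 0.6414 \end{bmatrix} = \begin{bmatrix} 0.8227 \\ 0.4214 \\ 0.0201 \\ -0.3812 \end{bmatrix}$$

so that the $U$ matrix is

$$U = \begin{bmatrix} 0.1525 & 0.8227 \\ 0.3499 & 0.4214 \\ 0.5474 & 0.0201 \\ 0.7448 & -0.3812 \end{bmatrix}$$

The original matrix $A$ therefore has the SVD

$$A = \begin{bmatrix} 0.1525 & 0.8227 \\ 0.3499 & 0.4214 \\ 0.5474 & 0.0201 \\ 0.7448 & -0.3812 \end{bmatrix} \begin{bmatrix} 14.2691 & 0 \\ 0 & 0.6268 \end{bmatrix} \begin{bmatrix} 0.6414 & -0.7672 \\ 0.7672 & 0.6414 \end{bmatrix}^T$$

To check this we can define the matrix A in MATLAB and call the svd function with the "economy" option.


>> A = [1 2;3 4;5 6;7 8];
>> [U,Sigma,V] = svd(A,'econ')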
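
The hand calculation in this section can also be scripted. The following is a minimal sketch, assuming the variable names Veig, Ueig, lambda, and order (these are illustrative and not from the text): it repeats the steps with eig and prints the results next to the output of svd so that the two can be compared.

```matlab
A = [1 2; 3 4; 5 6; 7 8];

% Steps from the text: eigen-decomposition of A'*A, then u_i = A*v_i/sigma_i.
[Veig, D] = eig(A'*A);
[lambda, order] = sort(diag(D), 'descend');   % put the largest eigenvalue first
Veig  = Veig(:, order);
sigma = sqrt(lambda);                         % singular values
Ueig  = A * Veig * diag(1 ./ sigma);          % scale column i by 1/sigma_i

% Built-in economy SVD for comparison.
[U, Sigma, V] = svd(A, 'econ');

disp([sigma, diag(Sigma)]);      % singular values side by side
disp([Ueig, U]);                 % left singular vectors side by side
disp([Veig, V]);                 % right singular vectors side by side
```

Sorting the eigenvalues explicitly is a deliberate choice here, since eig is not guaranteed to return them in the descending order that svd uses.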
Exercise 22.2

1. Compare our result to the output from the use of the svd function and explain any discrepancies.

2. Confirm that the original matrix can be reconstituted by calculating

$$\sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T$$

3. How good is the reconstruction if we only keep the first part?


Exercise 22.3 


1. Find the SVD of the following matrix by working through the steps outlined in this section.
(You can use eig in MATLAB to get the relevant eigenvalues and eigenvectors.)


3 2 Dy 
a =e 


2. Now use the svd command in MATLAB and check that your work makes sense. You will need to
use the "economy" option.





22.3 SVD and User-Movie Data [10 mins]


22.4 Preview of the Homework and Project [10 minutes] 


Chapter 23 


Homework 8: Eigenfaces Paper, 
Project, and Cheat Sheet 





Contents 
23.1 Eigenfaces Paper
23.2 Beginning the Project
23.3 Cheat Sheet





23.1 Eigenfaces Paper 


Check out Eigenfaces for Recognition, an early paper on Eigenfaces, by M. Turk and A. Pentland. You 
have most of the tools to understand this paper, but the writing style might be unfamiliar (intense!). We 
recommend that you take quite a bit of time reading through this paper (maybe about 3 hours). For some 
of you this may be the first time you are reading a technical research paper like this. The first 6 pages of 
this paper describe the use of Eigenfaces in face recognition. Check out other sources as well. Wikipedia is 
pretty useful for Eigenfaces, and this later paper talks about Eigenfaces and an extension called Fisherfaces 
(not fish faces). 


Some readings on reading (these are for your reading... no, seriously, read these first).


• Some pretty nice advice on how to read a paper that Michael Mitzenmacher uses in one of his Harvard
CS classes.


• Another nice guide to reading research papers. This one is by Jennifer Raff, a Professor of Anthropology
at the University of Kansas.


• A tongue-in-cheek guide to reading a scientific paper (read this if you are feeling that you are the only
one who is not capable of reading through the Eigenfaces paper and understanding it all in one go).


Exercise 23.1 


We are asking you to read this paper for several reasons. We hope that it highlights and synthesizes all
the material you’ve learned in this module. It will also give you practice reading a technical paper, 
which is a skill you’ll continue to develop over your career. 


1. Summarize the paper using a method of your choosing (the readings above have some suggestions
on what should be included in this).


2. In what ways was your approach to implementing the Eigenfaces algorithm similar or different
from the authors’ approach?


3. In what ways did your understanding of the Eigenfaces algorithm change after reading the
paper?


4. Were there places in the reading where you “got stuck”? If so, how did you address that?


5. What questions do you have after reading the paper?





23.2 Beginning the Project 


In this project you will extend the work you have already done on using linear algebra to analyze data (e.g., 
for face recognition) and analyze the performance of an existing algorithm within a real context. We know 
that facial recognition and other applications of linear algebra to data can be incredibly powerful, but they 
are often prone to failure, and those failures can have very real consequences on people’s lives. In this project 
we are challenging you to think about linear-algebra based systems in a real-world context. To prepare for 
Tuesday’s in-class ideation activities, we ask you to do two things: 


1. Read the project description, which can be found in the next chapter (Chapter 24), and write down 
any questions you find yourself asking. Please ask us these questions (e.g., by posting in the General 
channel on Teams or by e-mailing the teaching team list)! 


2. Fill out this partner survey by 11:59pm on Sunday, November 1st (we will review the forms first thing 
Monday morning). We will let you know who your partner is before you arrive for class on Tuesday, 
November 3rd. 


23.3 Cheat Sheet 


There is a tradition in schools and colleges that students take exams. A "cheat sheet" can be a valuable tool 
for studying. We would like you to prepare a cheat sheet for the material we have "covered" so far this 
semester. If you don’t know what a cheat sheet is then please consult Wikipedia. Cheatsheet.com is also 
fascinating. 


Chapter 24 


Faces Project: The Context and 
Consequences of Linear Algebra in 
the Real World 


24.1 Overview 


This is a project that asks you to extend the work you have already done on facial recognition and feature 
detection by analyzing the performance and considering the consequences of an existing algorithm within a 
real(ish) context. This is a fairly new formulation of this project and we are giving you the freedom (and 
responsibility) to choose an interesting path and follow it judiciously. 

The LinAlgCo owns the rights to all uses of linear algebra. They’ve recently become aware of the use 
of linear algebra in algorithms that are having profound (e.g., face detection, news recommendation) 
and not so profound impacts on our society. The company is concerned about the consequences 
these algorithms are having in the real world. They don’t want to tarnish the good name of linear 
algebra. You have been hired as a consultant to address these concerns. 

Specifically, they’ve asked you to do the following (note: this is very unlikely to be a process that you 
execute linearly. You will almost surely have to go back and revisit various steps as you learn more). 


1. Identify a specific context in which linear algebra is being used. Context can take into account 
both what task is being solved (e.g., face detection) as well as how it is being applied (e.g., what 
will the face detection results be used for, what data is being used to train the system, etc.). 


(Examples: Smile detection using linear regression, identifying missing persons from photos 
in social media, unlocking your phone with your face using Eigenfaces, classifying movie 
preferences...) 


2. Pose a question about the consequences of using linear algebra in this particular context. 
Your investigations might center around the themes we touched on earlier (e.g., privacy, bias, 
misuse, reinforcing negative structures...), or you might instead want to understand how well 
a particular system, even one with no obvious intersection with the aforementioned issues, 
would work in a particular context. Both of these framings are okay for the project. 


3. Answer some part of your question by analyzing the results of an algorithm that utilizes linear 
algebra. It can be one of the algorithms which we used earlier in this module (e.g., PCA or 
linear regression), or it can be something new that you will learn about on your own. You 
need to do some quantitative analysis, but the specifics are up to you. 


(Examples: How well does a smile detection system trained on the data we gave you in class 
work on data collected from your webcam? Does facial recognition have a higher accuracy 
with group X than group Y? How accurate would my chair detector be in a typical office 
environment? Are STEM documentaries more likely to be recommended to men than women?) 


Your consulting team is expected to produce a formal report, due to LinAlgCo by November 17th at 
10:00am. 





24.2 What we expect you to do 


1. Revisit the OPTICs activity that we did a few classes ago (here are the class OPTICs). Identify at least 
one OPTIC that you'd like to explore during your work on this project. 


2. Start with some background research on contexts for linear algebra and their associated impact when 
deployed in the world. This research will help you to choose what to focus on, and you should also 
reference this research when discussing the context of your project in the introduction of your 
report. 


3. Choose a question related to the real-world impact of linear algebra in your chosen context (e.g., Who 
might this technology benefit / harm? Would this technology work at all?). This question should 
be rooted in a real context, but you need not answer it in full. Break off a small sub-question that 
you think you can answer in two weeks through an analytical approach that utilizes PCA, or another 
linear algebra-based algorithm. 


4. Plan, execute, and document some analysis (which could include modifying/creating an algorithm) to 
answer your sub-question. 


5. Explain the mathematical algorithm you are using in detail, explaining the various steps and what the 
purpose of each step of the process is. Use equations! 


6. Explain how the results of your analysis inform the question you are trying to answer. Tie the results 
of your sub-question back into your larger question and chosen context. What can you conclude from 
the analysis you did? Recommend areas for future investigation. 


7. You should understand the metrics against which your programs should be measured. How do you 
characterize the accuracy of your approach? Against what should you compare this accuracy? How 
do you quantify the consequences of your approach? 


8. Communicate the context, analytical approach, and findings via a formal technical report to the 
LinAlgCo. 
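
As one way to make the questions in item 7 concrete, here is a minimal MATLAB sketch; it is not a required method. All of the data in it (the labels y_true and y_pred and the group assignments) are made up for illustration, and in your project they would come from your own test set. It computes an overall accuracy, a simple baseline to compare against (always guessing the most common label), and the accuracy within each of two hypothetical groups.

```matlab
% Hypothetical data: true and predicted labels (1 = smile, 0 = no smile)
% and a made-up group membership for each of ten test images.
y_true = [1 1 0 0 1 0 1 1 0 1];
y_pred = [1 0 0 0 1 1 1 1 1 1];
group  = [1 1 1 1 1 2 2 2 2 2];   % hypothetical groups 1 and 2

% Overall accuracy: fraction of predictions that match the truth.
accuracy = mean(y_pred == y_true);

% Baseline: always guess the most common label.
baseline = max(mean(y_true == 1), mean(y_true == 0));

% Accuracy within each group.
acc_group1 = mean(y_pred(group == 1) == y_true(group == 1));
acc_group2 = mean(y_pred(group == 2) == y_true(group == 2));

fprintf('accuracy = %.2f, baseline = %.2f\n', accuracy, baseline);
fprintf('group 1: %.2f, group 2: %.2f\n', acc_group1, acc_group2);
```

Comparing the accuracy with the baseline tells you whether the algorithm is doing better than a trivial guess, and comparing the per-group numbers is one simple way to start quantifying who might be affected by its errors.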


24.3 Important Considerations and Frequently Asked Questions 


Does this project have to be about algorithmic bias? 


No, it doesn’t. A few classes ago we spent time discussing the potential impacts of facial recognition 
technologies in real world contexts (e.g., on different populations of folks). If you’d like, you can certainly 
try to investigate a question that intersects with this discussion (e.g., how accurate are face recognition 
algorithms on people of different skin tones?). Instead, you may choose to create a technology that doesn’t 
have a plausible intersection with issues of algorithmic bias (e.g., counting whales in aerial photos). When 
choosing your question, you should be conscious of what feels like a good alignment between what you are 
interested in and what is a reasonable question for you to answer given your current skills and knowledge. 

You might find that you want to frame a big ethical question in your report, but you may only have 
time to answer a small subcomponent. This is okay. For example, you might talk about potential biases 
in automated essay grading systems, but perhaps your report itself will only touch upon estimating the 
accuracy of an automated essay grader in general (i.e., not broken down by subgroups of people). This is 
appropriate given that you are just learning this material. We also like to see the background research on 
the meatier ethical questions even if you can’t address them in full in your quantitative analysis. 


As a means of learning, can I create a system that I don’t think should be a thing 
that is deployed in the world? 


While this might not be a view shared by everyone in this class, our position is that implementing a 
potentially problematic technology can be a very effective means to understand the specific ways 
in which the technology might be problematic (however, see the next section for an important caveat). 

For example, if you really want to understand the perils of biases in age recognition technology, implementing 
your own system for age recognition might be a great learning experience. Further, there are many 
examples where creating a system can be used to expose bias in a data source. In such applications (e.g., this 
paper on understanding bias of human judges using machine learning), the goal of creating the system was 
never about deploying it in the world. 

This doesn’t mean that you should necessarily set out to study the most diabolical application of 
technology you can think of (please don’t do that). Instead, you might want to choose a technology (and a 
context for its application) where you are genuinely unsure of whether or not it’s a thing that should exist in 
the real world. You may gain more clarity on your beliefs about whether or not your chosen technology 
should be deployed in the real world as part of the project. 


What are my responsibilities to the QEA community in doing this project? 


We believe that QEA should be a place where learning can happen in a safe environment. As such, when 
doing your project you should be careful to think about your responsibility to the entire QEA community to 
maintain such an environment. For example, talking in demeaning ways about gender, age, religion, sex, 
race, or socioeconomic status (to name a few categories) is never okay. If you are investigating a particularly 
controversial topic, you want to be very cognizant of this. Because we don’t want you to think that nothing 
like this would ever happen at Olin, here are two specific examples of activities we’ve seen students try to 
do that we think are in violation of our collective responsibility to maintain a safe learning environment. 

• Students have taken pictures of other students’ faces without their consent and told them the race, 
gender, and age that their system predicted for them. 


• Students have tried to compare the physical attractiveness of students at Olin to other colleges using 
face analysis. This was experienced as demeaning by some members of the class. 


Should I just play it safe and pick something that is so irrelevant that I can be 
sure not to offend anyone? 


No! We encourage you to approach this project with curiosity to learn important (and consequential) things. 
As long as you maintain sensitivity towards others and criticality towards your own assumptions, you will 
be good. When in doubt, ask an instructor if what you are doing is okay. 


24.4 Resources 


1. Your existing eigenfaces algorithm or the example solution posted. Let us know if you need help 
getting eigenfaces working. 


2. The smile detection algorithm, which uses linear regression. Your version or the walkthrough from 
class can be modified to do something similar. 


3. Training and test images for your class and past QEA classes. 


4. The 10k faces database. This includes >2,000 images that have been classified in terms of demographics 
and other info (like whether people are facing the camera) and a software tool to narrow the database 
by classifiers (e.g., to only smiling men). The downside of this database is there is only one photo of 
each person. 


5. The internet. In addition to doing context/background research, you can go find a different algorithm 
or face database if you prefer, but be aware that this will take extra time! 


6. Your teaching team. Remember that we are here to support your learning! Bounce ideas off of us in 
office hours. Don’t let MATLAB get you down; ask for help early and often. 


24.5 Deliverables 


You need to produce a written report, but we’ve broken it into a few sub-assignments to keep you on track 
and create opportunities for feedback. 


1. Due Thursday 11/6: An informal document outlining your chosen context, big question, sub-question, 
and the algorithm you will use to answer your sub-question, plus a plan for what analysis you will do 
and what kind of results you will get. This document should serve as an outline for your final report. 
Since this is coming so soon after starting the project, some of this will probably change as you go 
into the heart of the project. 


2. Due Tuesday 11/10: An update to your document (you can append to the previous one). This document 
should list the big question, and sub-questions and the algorithm(s) you are using. In addition, you 
should describe the graphs you plan to include in your project, and provide evidence of concrete 
progress on your project (e.g., listings of code, graphs, a flowchart). 


3. Due Tuesday 11/17: A final version of your written report. 


24.6 Project report 


You will generate a professional-looking and edited report to send to LinAlgCo that summarizes and justifies 
the decisions you have made. The executives at LinAlgCo are familiar with linear algebra and mathematics 
notation, but you should not assume they know anything about your particular linear algebra-based algorithm 
or the context in which you are studying it. 

The goal of the report should be to help LinAlgCo understand the context and consequences of the 
specific algorithm you have questioned and analyzed. You are NOT writing a story about what you did in 
the project. Aim for content and clarity, not length. 


Structure 


The report should have the following six sections (and you might want to break them into subsections): 


1. Summary 


What will I find in this report? 

Open with a one paragraph summary that orients LinAlgCo to what they will find in the report. It should 
make clear why the report was written, what each section will accomplish, and what the key insights and 
results are. You should also summarize your recommendations. 


2. Introduction 


What is this project? 

Your introduction should: (1) Provide the background for the algorithm and the context that you 
have chosen to analyze. When and where is it used? By whom? What are the general technological or social 
issues associated with its use? (2) Explain your algorithm technically. How does it work? Bear in mind your 
audience. (3) Lay out the general ethical implications of the algorithm that you are investigating. Under 
what circumstances could the technology be helpful or harmful? Whom might it help or harm, and how? (4) 
Clearly state, within the broader ethical context, what question or issue you are exploring and what 
sub-question you are quantitatively investigating. 


3. Methods 


How does your approach work, and what did you do with it? 

Having introduced the reader to terminology and ideas, this section should lay out the approaches you 
are using, both in terms of the chosen algorithm and the analysis you are doing with it. Use equations and 
define all variables. 


4. Detailed Findings 


What are the main results and consequences of your work? 

This section should contain your main results and the consequences of your work that you have quantified. 
This section should contain some clear, informative, labeled, and captioned plots and images that demonstrate 
your findings. Quantitative results should be clearly connected to the context of the investigation. Why are 
your findings meaningful? Reflect on the downsides of the technology and the people it could hurt, and 
suggest some strategies for improvement. 


5. Recommendations 


What are the key takeaways? Summarize the key findings of the report, situate them in the greater context, 
and identify areas for future investigation. This section should be concise; the details go in the previous 
section. 


6. References 


Provide full citations for sources referenced in the paper. Format doesn’t matter here as long as you provide 
sufficient information about each of your sources. 


24.7 Grading rubric 
1 pt. Summary presents a clear, high-level overview of paper. 


3 pts. Question being investigated is clearly rooted in a real context, as discussed in the introduction and 
justified with references. Discussion of potential for harm is thorough. 


2 pts. Algorithm is clearly explained using equations and words. 
2 pts. Analytical approach is clearly explained. 
2 pts. Findings are justified with appropriate figures and discussion. 


2 pts. A clear connection is drawn between the findings and the original sub-question, greater question/issue, 
and greater context. 


2 pts. Paper is logically organized and writing, figures, and equations are polished. 


Schedule 


25.1 Project Ideation [80 mins] 


User Ideation Extravaganza 


There are A LOT of possible questions you could propose for this project. In the project document we prompt 
you to choose an important question related to feature recognition, detection, or classification. This should be 
rooted in a real context, and you will likely not be able to answer it entirely. Break off a small subquestion 
that you think you can answer in a short project through an analytical approach that utilizes eigenfaces, 
another facial recognition algorithm, or linear regression. We recognize that this is a very open ended and 
somewhat ambiguous prompt, but we know you are up to the challenge! We will be taking the remainder of 
class to generate lots of possible questions, refine ideas, and flesh out possible directions. 


Exercise 25.1 


1. Do this part individually [10 minutes]: Write down (on Miro) as many different ideas for 
possible questions related to linear algebra in the real world as you can on different sticky 
notes. Go wild! 


2. Do this part with your breakout room [10 minutes]: Group all of the relevant questions together 
and label this group (e.g., all of the questions around flagging and the potential bias in that 
process could go into one group). Draw on the board as needed. 


3. Do this part with your breakout room [10 minutes]: Look at other boards, reading the idea 
clusters from other students in the class. 





Pair Project Ideation 


This next stage is going to give you an opportunity to work with your partner to develop a complete project 
idea. 


Exercise 25.2 


1. With your partner, select a question (or group of questions) from the boards. Copy the 
appropriate sticky notes and bring them to your board. [10 minutes] 


2. Collaboratively develop an idea that identifies a question, its context, and how you could 
perform some analysis. Do some internet research to find out more about your question and 
its context. What are the ethical issues associated with your topic? What is known and what 
is still in question? Fill out the project pitch handout (since we are doing this electronically, 
it might make sense to create a Google doc to fill out the answer to these questions. You are 
welcome to use whichever digital collaboration tool you'd like). You will need to define the 
question itself, the real world context, and details about the critical concepts from this module 
that will allow you to perform the desired analysis. [20 minutes] 





Project Worktime (if time) 


Get started on the details of your project. You should consider this an extension of the project ideation 
time. Play with ideas, and hopefully, by the end of class you'll feel like you've settled on an idea and have a 
direction to go in. Use this time to chat with the faculty. 


Chapter 26 


Night 9 


Project work time. Remember that the outline of your report is due Monday. You should also get started on 
the analysis before Monday, but you don’t have to turn in any results yet. The last few sections can simply 
be an outline of what you will do. 


Day 10 


Project work time. 


Part II 


The Design and Stability of Boats: 
Multivariable Calculus and Mechanics 


Boats Week 1a: Goodbye Faces, Hello 
Boats 





Schedule 
28.1 Sharing your project [60 mins] 


28.1 Sharing your project [60 mins] 


We'd like you to take some time to share your work with others in the class. In particular, we'd like you 
to share what you worked on, why you worked on it, how your approach worked (or didn’t), and what you 
learned from the process. 

To do this, we’re going to create breakout rooms consisting of 6 teams each. Each team will have less 
than 10 minutes to share their project and answer questions. It’s up to you how you do this but here are a 
few suggestions: 


• Don’t try to share everything - focus on the key elements. 
• Just talking is absolutely fine, but it probably makes sense to "share screen" something, e.g., a key figure. 


• It’s fine to share parts of your report - just don’t try to read through the whole thing - focus instead 
on the key pieces. 


• It’s fine to put a few slides together, but you don’t have to - if you do please keep them to a minimum. 


28.2 Reflection [30 mins] 


The instructors will take a few minutes to each share reflections about the course so far. There will be an 
opportunity for you to provide your own thoughts. 


Boats Week 1b: Rapid Boat Build 





Schedule 
29.1 Fabrication Requirements 
29.2 Performance Requirements 


We're going to launch today with a short design challenge that will form the basis of our work for this 
module. Over the course of the next 90 minutes, you will work with a partner to design and fabricate a boat 
out of cardboard. The requirements for the boat are listed below. You are welcome to use any resources 
(internet, CAD, etc.) available to you, within the provided time constraints. 

You must test your own boat for performance, but you may only test it once! 


29.1 Fabrication Requirements 


1. You may only make your boat out of the provided materials: cardboard, ballast, and adhesives. 
2. Your boat must be a keel-less, monohull design. 

3. Cardboard can be cut using scissors or X-Acto blades. Please be careful! 

4. Cardboard can be secured using staples, tape, tacks, glue...whatever you think will hold. 

5. Cardboard can be layered: it is acceptable to create a boat using multiple pieces of board. 

6. You should make your boat as water-tight as possible! 


7. The boat must accept a payload of approximately 720 g. The payload has the dimensions of two 12 oz. 
soda cans and can be placed at the location of your choice. 


29.2 Performance Requirements 


1. The boat must float when fully loaded. 


2. The deck of the boat should be within 5° of parallel to the surface of the water when the boat is fully 
loaded. 


3. The angle of vanishing stability (AVS) for the fully loaded boat should be between 120 degrees and 140 
degrees. To measure the AVS you should place your boat in the water and start to rotate it. Initially it 
should "feel" as if the boat wants to rotate back to level. As you keep rotating you will find the point 
at which it doesn’t want to rotate back - this is the AVS. 


29.3 Testing Procedure 
Note: we do not expect that you will be able to complete the build and do the test all during 
class. While you must complete the build in the 90 minutes, the test can be done after class. 


From Thursday morning’s class on 11/19 until Monday morning’s class on 11/23 there will be two kiddie 
pools set up for you to test your boat (see below for a picture of one of the pools). 





Due to COVID restrictions, the two pools will be outside of the MAC, under the overhangs. The pools 
will be located on either side of the main entrance to the MAC (the main entrance being the one right across 
from the elevators). 

In order to test your boat, perform the following steps (be careful to follow appropriate COVID protocols 
since your classmates from other household groups will probably be testing around the same time). 


1. Place your boat in the water. Does the boat float? Take a picture to show the result of this test. 


2. Does the boat float flat? The deck of your boat should be within 5° of parallel to the surface of the 
water. For this exercise it’s not important to get a super-precise measurement of this. You should be 
able to tell, by eye, whether or not this requirement has been met. 


3. Measure your boat’s AVS. As stated before, to measure the AVS you should place your boat in the 
water and start to rotate it. Initially it should "feel" as if the boat wants to rotate back to level. As you 
keep rotating you will find the point at which it doesn’t want to rotate back - this is the AVS. Jot down 
a rough estimate of the AVS for your boat (again, a super precise measurement is not necessary) and 
snap a picture of your boat rotated to its AVS. 


29.4 Deliverable 


After the testing, we will ask you to take time to make a record of this activity and upload this to Canvas. 
This should be a pdf file that includes: 


1. A picture of your boat 
2. A record of the results of your boat testing. 


3. A brief description of your process: What did you do? What were your thoughts as you attempted 
this? What were the important considerations? 


Homework 1: Curves and Surfaces 


Learning Objectives 


Concepts 
• Distinguish between equations that represent explicit functions, implicit functions, and 
parametric functions. 


• Identify exponential, polynomial, and trigonometric relationships by the shape of their 
curves. 


• Describe how changes in parameters affect the shape of curves or surfaces. 


• Determine a mathematical approximation to the surface of a real physical object. 


MATLAB Skills 


• Use MATLAB to define and visualize curves and surfaces defined by explicit functions, 
implicit functions, and parametric functions. The relevant MATLAB functions are plot, plot3, 
contour, surf, and isosurface. 


30.1 Curves 


30.1.1 Curves defined Explicitly 


If you recall, single-variable calculus involved explicit functions of a single variable, e.g. y = t^2, y = sin(t), 
or more generally y = f(t). You spent a lot of time visualizing these functions, solving equations with these 
functions, and computing related properties like derivatives and integrals. 

Let’s consider the function y = mt + b, where m and b are parameters. We probably recognise this 
function, and that its graph is a straight line with slope m and intercept b. 
In MATLAB we can visualize this function using the plot function which you are already familiar with. 


>> m = 2;
>> b = 1;
>> t = linspace(-10,10,1000);
>> y = m.*t+b;
>> plot(t,y,'red')


First, we define a value of m and a value of b as an example. Second, we use the function linspace to 
generate 1000 equally-spaced points between -10 and +10. There is nothing special about this domain, except 
that the resulting graph captures the behavior of the function. Third, we evaluate the mathematical function 
at these points and store the result in y. Fourth, we call the plot function to generate the curve — we use red 
in this case because it looks great! Hopefully we recognize the classic straight-line which has the following 
features: 


• y tends to +∞ as t → +∞ if m > 0. 


• y tends to -∞ as t → +∞ if m < 0. 


• The line passes through the point (0, b). 


One way to capture these different behaviors is to sketch sample curves in each quadrant of the b-m space. 





Figure 30.1: Straight-lines with different slopes and intercepts. 
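
To produce a figure along these lines yourself, a sketch like the following may help. The particular values of b and m are arbitrary choices of ours, one for each sign combination; the figure in the text may well use different ones.

```matlab
% Sketch: one straight line per sign combination of (b, m).
% The parameter values below are arbitrary illustrative choices.
t = linspace(-10, 10, 1000);
params = [ 1  2;    % b > 0, m > 0
          -1  2;    % b < 0, m > 0
          -1 -2;    % b < 0, m < 0
           1 -2];   % b > 0, m < 0

figure; hold on
for i = 1:size(params, 1)
    b = params(i, 1);
    m = params(i, 2);
    y = m.*t + b;            % evaluate y = mt + b at every t
    plot(t, y)
end
hold off
xlabel('t'); ylabel('y')
legend('b>0, m>0', 'b<0, m>0', 'b<0, m<0', 'b>0, m<0')
```

Each curve crosses the vertical axis at its own value of b, and the sign of m decides whether the line rises or falls as t increases.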


Exercise 30.1 


Consider the exponential function y = A e^{kt}, where A and k are parameters. 


1. What is the value of y when t = 0? 





2. What happens to y as t → +∞? How does this limiting behavior depend on the sign of A and 
k? 


3. Now sketch examples of the curves in the four quadrants of the A-k space. 


4. What is the effect of the parameters A and k on the curve? 


Exercise 30.2 


Consider the logistic function y = A/(1 + e^{-kt}), where A and k are parameters. 


1. What is the value of y when t = 0? 





2. What happens to y as t → +∞? How does this limiting behavior depend on the sign of k? 


3. Now sketch examples of the curves in the four quadrants of the A-k space. 


4. What is the effect of the parameters A and k on the curve? 


Exercise 30.3 


Consider the trigonometric function y = A sin(ωt + φ) with parameters A, ω, and φ. 


1. Sketch some representative examples of these curves for different values of the parameters. 


2. What features of the curve do A, ω, and φ control? Use the internet to deepen your understanding 
of these parameters. 


Exercise 30.4 


Consider the quadratic polynomial in vertex form y = g(x - h)^2 + k, with parameters g, h, and k. 


1. Sketch some representative curves for different parameter values. 


2. What features of the curve do g, h, and k control? 


3. What is the relationship between g, h, and k in the vertex form and a, b, and c in the standard 
form y = c + bx + ax^2? (This will require some algebra.) 


The quadratic polynomial is probably very familiar to you. There are numerous ways to write this 
second-order polynomial, and we’ve used two forms here: the standard form and the vertex form. It is 
hard to tell the effect of each parameter in standard form. Using the vertex form, however, the effect of 
each parameter is much easier to interpret. 
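
If you would like to check your hand sketches, one possible approach is to vary a single parameter of the vertex form while holding the others fixed. The values of g, h, and k below are arbitrary choices for illustration. The last two lines locate the lowest point of the final curve numerically, which you can compare with what you expect from the vertex form.

```matlab
x = linspace(-5, 5, 1001);
g = 1; k = -2;                 % held fixed (arbitrary values)

figure; hold on
for h = [-2 0 2]               % vary h only
    y = g.*(x - h).^2 + k;     % vertex form evaluated at every x
    plot(x, y)
end
hold off
xlabel('x'); ylabel('y')
legend('h = -2', 'h = 0', 'h = 2')

% Lowest point of the last curve plotted (h = 2).
[ymin, idx] = min(y);
xmin = x(idx);
```

Repeating the loop over g or k instead of h gives the corresponding families of curves.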


30.1.2 Curves defined Implicitly 


Not every curve can be expressed in terms of an explicit function in which there is only one output for each 
value of the input. Curves can also be expressed implicitly through a relationship between 2 variables. A 
circle is a good example. For example, the equation for a circle of radius 1, centered at the origin, is 


x^2 + y^2 - 1 = 0 (30.1) 


The left-hand side of this equation can be thought of as a function of two variables, f(x, y) = x^2 + y^2 - 1, 
and the set of points (x, y) where f = 0 defines a curve that we like to call the unit circle. 

In order to visualize such curves in MATLAB we use the contour function. We begin by defining a grid 
of (x,y) points using the meshgrid function 


>> [x,y]=meshgrid(linspace(-2,2,100),linspace(-3,3,200)); 


You will notice that both x and y are 200 x 100 matrices. There are 200 rows corresponding to the 200 
y-values between -3 and 3. There are 100 columns corresponding to the 100 x-values between -2 and 2. There 
is nothing special about the limits of the domain or the number of points in each direction - we chose values 
here that would help explain the size of the resulting matrices. 
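
A quick way to convince yourself of this layout is to inspect the matrices directly. The sketch below only uses the grid defined above; the comments describe what each command displays.

```matlab
[x, y] = meshgrid(linspace(-2, 2, 100), linspace(-3, 3, 200));

size(x)       % number of rows, then number of columns
size(y)

x(1, 1:3)     % x changes as we move along a row (across columns)
y(1:3, 1)     % y changes as we move down a column (across rows)
```

Every row of x is the same list of x-values, and every column of y is the same list of y-values, so the pair (x(i,j), y(i,j)) runs over all points of the grid.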

Now that we have the grid defined, we compute the value of the function f at every point. Since x and y 
are already matrices we can use 


>> f = x.^2 + y.^2 - 1;


Notice that we use the .^ operator because we want every entry in the x matrix to be squared, and similarly 
for y. You will also notice in MATLAB that f is a 200 x 100 matrix. In theory, subtracting "1" (a scalar) from 
a matrix should not be permitted, but the good people at MathWorks have decided to interpret this for us 
automatically. 
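
The difference between the element-wise power and the matrix power is easiest to see on a small example. The 2 x 2 matrix A below is our own illustrative choice, not part of the exercise.

```matlab
A = [1 2; 3 4];

elementwise = A.^2;    % squares each entry: [1 4; 9 16]
matrix_sq   = A^2;     % matrix product A*A: [7 10; 15 22]
shifted     = A - 1;   % scalar expansion: subtracts 1 from every entry
```

For the 200 x 100 matrix x above, x^2 would produce an error, since the matrix power is only defined for square matrices.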

To plot the curve we now use the contour function 


>> contour(x,y,f,[0 0]) 
>> axis equal 


which should produce a circle of radius 1 centered at the origin. The last argument to the contour function 
tells it to draw the contour at f = 0. The level is written as [0 0] rather than a single 0 because contour 
treats a scalar in this position as the number of contour levels to draw, not as a height. Without the "axis 
equal" the curve would look like an ellipse due to the different 
scaling MATLAB will use in the x and y directions. 
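
To see what the last argument of contour does, it can help to ask for more than one level. In the sketch below the level values are arbitrary choices of ours; since f = c is the same as x^2 + y^2 = 1 + c, each level c should give a circle of radius sqrt(1 + c).

```matlab
[x, y] = meshgrid(linspace(-2, 2, 100), linspace(-3, 3, 200));
f = x.^2 + y.^2 - 1;

levels = [-0.75 0 1.25];        % illustrative level values
radii  = sqrt(1 + levels);      % f = c  <=>  x^2 + y^2 = 1 + c

contour(x, y, f, levels)        % one contour line per listed level
axis equal
xlabel('x'); ylabel('y')
```

The middle curve is the unit circle from before, and the other two are concentric circles whose radii are stored in radii.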

There is no end to the functions of two variables that you can define. There is, however, a set of functions 
that show up again and again, and these are the quadratic functions of two variables. The general form 
(containing all possible quadratic, linear and constant terms) is 


ax^2 + bxy + cy^2 + dx + ey + f = 0 (30.2) 


where a, b, c, d, e, f are arbitrary parameters, some of which may be zero. The curves defined by this equation 
are called conic sections, and represent the intersection of a double cone and a plane. The non-degenerate 
cases include circles, parabolas, ellipses, and hyperbolas. See the Wikipedia article on conic section for more 
information. 
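
One convenient way to experiment with this general form is to wrap it in an anonymous function so that the six coefficients can be changed in one place. The name conic and the coefficient values below are our own illustrative choices; with a = 1, c = 1, f = -1 and the other coefficients zero, the general form reduces to the unit circle of equation (30.1).

```matlab
% General quadratic in two variables; coefficient names follow (30.2).
conic = @(x, y, a, b, c, d, e, f) ...
    a*x.^2 + b*x.*y + c*y.^2 + d*x + e*y + f;

[x, y] = meshgrid(linspace(-4, 4, 300), linspace(-4, 4, 300));

% Illustrative coefficients: reduces (30.2) to x^2 + y^2 - 1 = 0.
z = conic(x, y, 1, 0, 1, 0, 0, -1);

contour(x, y, z, [0 0])         % draw only the curve where z = 0
axis equal
xlabel('x'); ylabel('y')
```

Changing the coefficients in the call to conic and re-running the last few lines is a quick way to explore the different curves this equation can produce.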


Exercise 30.5 


Use the internet to find the implicit equation for a circle of radius R, centered at the point (a, b). 
Visualize the circle in MATLAB for different values of a, b, R. 

This is a warm-up question. The implicit equation for a circle centered away from the origin should 
be easy to find, and you should use the visualization to check that changing the parameters moves the 
circle and changes its radius in the way you expect. 


Exercise 30.6 


Visualize an ellipse using the implicit definition 


$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ (30.4)


for different values of a and b. What features of the ellipse do a and b control? Use the internet to
deepen your understanding of the parameters a and b—see for example the Wikipedia article on 
conic section. 

This question requires a little modification to the visualization for the circle, and a little internet research 
to fully understand the parameters. Start with the article on conic section, and then spend a little time 
exploring after that - don’t get lost in the world of the internet, and don’t be surprised when you see lots 
of terminology that you don’t understand. 


Exercise 30.7 


Visualize a hyperbola using the implicit definition


$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$ (30.5)


for different values of a > 0 and b > 0. What features of the hyperbola do a and b control? Use the 
internet to deepen your understanding of the parameters a and b. 

This question is similar to the one for the ellipse. In this case, however, interpreting the parameters 
without some additional reading is much harder because the precise impact of the parameters is not 
obvious from visualization. Again, start with the Wikipedia article on conic section and take it from 
there. 


Exercise 30.8 


Use the internet to find the conditions under which the solutions of


$ax^2 + bxy + cy^2 + dx + ey + f = 0$ (30.6)


define an ellipse, a parabola, a hyperbola, and a circle.


This question is meant to broaden your understanding of the possible solutions of this general quadratic 
polynomial in two variables. Start with the Wikipedia article on conic section. 




30.1.3 Curves defined Parametrically 





A more general representation of a curve involves expressing its coordinates in terms of another independent
variable or parameter as follows


$x = f(u),\ y = g(u),\ u \in [a, b]$ (30.7)


Each value of u defines a point with coordinates (f(u), g(u)). If we collect all the points defined by u in a
specific interval, then we get a parametric curve. For example, the definition


$x = \cos(u),\ y = \sin(u),\ u \in [0, 2\pi]$ (30.8)


defines a unit circle centered at the origin which begins and ends at (1,0) and is traced out counterclockwise
as u increases from 0 to $2\pi$. The parameter u can therefore be thought of as the angle from the x-axis to the
current point on the circle.

To visualize parametric curves in MATLAB we still use the plot function as follows 


>> u = linspace(0,2*pi,100);
>> x = cos(u);
>> y = sin(u);
>> plot(x,y,'*')
>> axis equal


We first define a set of u points on the interval $[0, 2\pi]$. We then compute the x and y coordinates for every
value of u. We finally plot the points, using an * for clarity and an "axis equal" so that we recognise the circle.
How do we "know" that these parametric equations trace out a circle, and not just a curve that looks
like a circle? Let’s check by substituting the definition of x and y into the equation for a circle of radius 1,
centered at the origin,

$x^2 + y^2 - 1 = \cos^2(u) + \sin^2(u) - 1 = 0$ (30.9)


which required the use of the trigonometric identity $\cos^2(u) + \sin^2(u) = 1$.
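
The same check can be done numerically. The sketch below is only an illustration, with variable names chosen here: it substitutes the sampled parametric points into the left-hand side of the implicit equation and reports the largest deviation from zero.

```matlab
% Sample the parameter and compute the parametric points
u = linspace(0, 2*pi, 100);
x = cos(u);
y = sin(u);

% Substitute into the implicit equation x^2 + y^2 - 1 = 0
residual = x.^2 + y.^2 - 1;

% Largest deviation from zero; expected to be at rounding-error level
max_residual = max(abs(residual))
```

A numerical check like this supports the algebra but does not replace it, since it only tests finitely many values of u.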


Exercise 30.9 


Use the internet to find a set of parametric equations that define an ellipse, and use MATLAB to 
verify them visually. Show that the parametric equations satisfy the implicit equation for an ellipse. 
This is a small change to the parametric equations for a circle, and finding parametric equations on the 
internet should be straight-forward - try searching on "parametric equations for ellipse" or start with the 
Wikipedia page on conic section or ellipse. 


Exercise 30.10 


A logarithmic spiral can be defined by the parametric equations 


$x = a e^{-bu}\cos(u),\ y = a e^{-bu}\sin(u),\ a > 0,\ b > 0,\ u \in [0, \infty)$ (30.12)

Visualize the curve in MATLAB for different values of a and b. You won’t be able to define an infinite
domain but you can define a large one. How do a and b change the curve?

This question involves a curve that has been of interest to mathematicians and scientists for many 
centuries. Try searching on the internet for the term "logarithmic spiral". 


Exercise 30.11 


A helix in 3D can be defined by the parametric equations 


$x = a\cos(u),\ y = a\sin(u),\ z = bu,\ a > 0,\ b > 0,\ u > 0$

Visualize this curve for different values of a and b. How do a and b change the curve? (You will need
to use plot3 in MATLAB.)

This question demonstrates that it is relatively simple to define a curve in 3D - just define the x, y, and z
coordinates in terms of a single parameter. This curve is a good example, and has been widely studied
in modern biology given its connection to the shape of DNA. Use the Wikipedia article on "helix" as a
starting point.




30.1.4 Data-Driven Curves 





We are often tasked with finding a curve that fits a set of data. You’ve probably seen informal approaches to 
this, particularly when finding the best-fit straight-line to a set of data points. Fortunately we have a robust, 
formal tool at our disposal now - orthogonal projection, often known as linear regression in this context. 

Let’s start with some data. Consider the 4 points (0, 1), (1, 0), (3, 2), (5, 4). We can use MATLAB to plot 
these points 


>> x = [0 1 3 5]'; 
>> y = [1 0 2 4]'; 
>> plot(x,y,'*') 


Notice that we placed all of the x-coordinates in a column vector x and all of the y-coordinates in a column 
vector y. Let’s now find the best-fit straight-line through these points, i.e. let’s find the parameters m and b
so that the straight-line defined by

$y = mx + b$ (30.14)


fits the points as well as possible. 
The approach we take is motivated by our work in linear algebra. If we pack all of the x-coordinates into 
a vector x and all of the y-coordinates into a vector y then we would like to satisfy the vector equation 


$\mathbf{y} = m\mathbf{x} + b$ (30.15)


as well as we can. Notice that there are 4 equations here (one for each point) and only two unknown 
parameters. An exact solution is impossible (unless the points happen to lie on a line) and so we use 
orthogonal projection to find the best solution. Let’s define a matrix A and parameter vector p so that the 
vector equation for a straight-line becomes 


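$A\mathbf{p} = \mathbf{y}$ (30.16)

where A and p are given by

$A = \begin{pmatrix} \mathbf{x} & \mathbf{1} \end{pmatrix}, \quad \mathbf{p} = \begin{pmatrix} m \\ b \end{pmatrix}$


Notice that there is a coefficient of "1" in front of the "b" term so we had to create a column vector and fill it
with 1’s.
Recall that to find the best solution we multiply by $A^T$,


$A^T A\mathbf{p} = A^T\mathbf{y}$ (30.17)

and solve this linear system for p.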


>> A = [x ones(4,1)] 
>> p = A'*A\A'*y 


For these data points we should find that m = p(1) = 0.6949 and b = p(2) = 0.1864. You should plot the 
straight-line defined by this slope and intercept to see how good the fit is. 
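
A minimal sketch of that check, using the same four data points: it solves the normal equations with explicit parentheses, overlays the fitted line on the data, and confirms that the residual is orthogonal to the columns of A. The names xfit, yfit, r and orth_check are introduced here for illustration.

```matlab
% Data points from the text
x = [0 1 3 5]';
y = [1 0 2 4]';

% Solve the normal equations A'*A*p = A'*y for p = [m; b]
A = [x ones(4,1)];
p = (A'*A) \ (A'*y);

% Evaluate the fitted line on a fine grid and overlay it on the data
xfit = linspace(0, 5, 50)';
yfit = p(1)*xfit + p(2);
plot(x, y, '*', xfit, yfit, '-')

% The residual of an orthogonal projection is orthogonal to the columns of A
r = y - A*p;
orth_check = A'*r
```

Both entries of orth_check should be zero up to rounding error, which is the defining property of the orthogonal projection.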

Exercise 30.12


Find the best-fit parabola for these 4 data points. Recall that a parabola can be defined using the
explicit function $y = ax^2 + bx + c$.




30.2 Surfaces 





30.2.1 Surfaces defined Explicitly 


If we assign the output of a function of two variables f(x, y) to be a third variable, z = f(x, y), then the
set of points in 3D defines a surface. For example, $z = x^2 + y^2$ defines a paraboloid. This surface can be
visualized in MATLAB using the surf function.


>> [x,y]=meshgrid(linspace(-2,2,100),linspace(-2,2,100));
>> z = x.^2 + y.^2;
>> surf(x,y,z)
>> shading interp


First we lay down a grid of points in the xy-plane using meshgrid. Next we compute the value of the 
function at each of these points and assign the value to z. Finally we pass the x, y, z matrices to surf for 
rendering—we include a shading option to make the surface look nice and smooth. 

It is often helpful to visualize a surface by drawing the contours defined by holding one of the variables 
constant. For example, if we define z = 1 in the equation for the paraboloid we obtain $x^2 + y^2 = 1$, which
we know to be the equation of a circle of radius 1, centered at the origin. Choosing different values of z will
define circles of radius $\sqrt{z}$. We already used the contour function in MATLAB earlier; here we will use it
to draw the contours at different values of z


>> contour(x,y,z,'ShowText','On')
>> axis equal 


In this case we are allowing MATLAB to pick the contour levels and we are including labels on the contours 
to show the corresponding value of z. We include the "axis equal" option in order to recognise that the 
contours are circles. 

We can also "slice" the surface along the different coordinate directions. For example, if we wanted to 
plot the contours in the yz-plane where x is constant we would use 


>> contour(y,z,x,'ShowText','On')

If we define x = c and substitute it into the definition of the function we see that

$z = y^2 + c^2$ (30.18)


which is the equation of a parabola in the yz-plane and the value of c controls where it crosses the z-axis 
(y = 0). The contour plot should support that analysis. We could also view the constant y contours in the 
xz-plane and we would find parabolas again—thus the reason we refer to the surface as a paraboloid. 
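
One hedged way to confirm the slice analysis numerically is to pull a single column out of the gridded surface, since every entry in a column of a meshgrid shares the same x value. The sketch assumes the paraboloid $z = x^2 + y^2$ from above and uses 101 grid points (a choice made here) so that x = 1 falls on the grid.

```matlab
% Paraboloid sampled on a grid; x varies across columns, y down rows
[x, y] = meshgrid(linspace(-2, 2, 101), linspace(-2, 2, 101));
z = x.^2 + y.^2;

% Column 76 holds the grid points with x approximately equal to 1
j = 76;
c = x(1, j);

% Extract the constant-x slice and plot it in the yz-plane
slice_y = y(:, j);
slice_z = z(:, j);
plot(slice_y, slice_z)

% Compare the slice with the predicted parabola z = y^2 + c^2
gap = max(abs(slice_z - (slice_y.^2 + c^2)))
```

Choosing a different column index j gives a different constant c and shifts where the parabola crosses the z-axis.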


Exercise 30.13 


Visualize the elliptic paraboloid $z = x^2/a^2 + y^2/b^2$ for different values of a > 0 and b > 0.


1. Describe the contours in the yz-plane defined by x = c. 

2. Describe the contours in the xz-plane defined by y = c.


3. Describe the contours in the xy-plane defined by z = c. 


This question requires you to combine surface visualization with the curve visualization that we met 
earlier. To fully understand the parameters you should try to explain why the surface is called an elliptic 
paraboloid. 




30.2.2 Surfaces defined Implicitly 





A surface in three dimensions can also be implicitly defined by a function of three variables. For example, 
the equation for a unit sphere centered at the origin is 


$x^2 + y^2 + z^2 - 1 = 0$ (30.19)


The left hand side of this equation can be thought of as a function of three variables, f(x, y, z), and the set 
of points where f = 0 defines the unit sphere. We can use the isosurface function in MATLAB to visualize: 


>> [x,y,z] = meshgrid(linspace(-2,2,100),linspace(-2,2,100),linspace(-2,2,100)); 
>> f = x.^2 + y.^2 + z.^2 - 1;
>> isosurface(x,y,z,f,0)
>> axis equal


We first define a set of points in 3D space using the meshgrid function. Next we evaluate the function f at
all of these points. We then use isosurface to render the surface defined by f = 0, and we use the "axis
equal" option so that the resulting surface looks like a sphere.

There are lots of implicit surfaces, but a particularly important group is the quadratic (or quadric) surfaces, 
defined by the equation: 


$Ax^2 + By^2 + Cz^2 + Dyz + Ezx + Fxy + Gx + Hy + Iz + J = 0$ (30.20)


where A, B, C, D, E, F, G, H, I, and J are all arbitrary constants, some of which may be zero. 


Exercise 30.14 
Visualize the hyperboloid of one sheet defined by 


$x^2/a^2 + y^2/b^2 - z^2/c^2 - 1 = 0$


for different values of a, b, c. What features of the hyperboloid do a, b, c control?





30.2.3 Surfaces defined Parametrically


Finally, a more general representation of a surface involves expressing its coordinates in terms of two
independent variables as follows


$x = f(u,v),\ y = g(u,v),\ z = h(u,v),\ u \in [a,b],\ v \in [c,d]$ (30.22)


Each value of (u,v) defines a point in 3D with coordinates (f(u, v), g(u, v), h(u, v)). If we collect all the
points defined by (u, v) in the specified domain, then we get a parametric surface. For example, the definition


$x = \sin(u)\cos(v),\ y = \sin(u)\sin(v),\ z = \cos(u),\ u \in [0,\pi],\ v \in [0, 2\pi]$ (30.23)


defines a unit sphere. In MATLAB we visualize a parametric surface using surf. 


>> [u,v] = meshgrid(linspace(0,pi,100),linspace(0,2*pi,100));
>> x = sin(u).*cos(v);
>> y = sin(u).*sin(v);
>> z = cos(u);
>> surf(x,y,z), shading interp
>> axis equal


First we lay down a grid of points in the (u,v) space using meshgrid. We then compute x, y, z at each of
these points, and we render the surface using surf.
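
As with the parametric circle, we can check that these parametric equations really land on the unit sphere by substituting them into its implicit equation. The following is a small illustrative sketch with variable names chosen here.

```matlab
% Grid of parameter values (u, v)
[u, v] = meshgrid(linspace(0, pi, 100), linspace(0, 2*pi, 100));

% Parametric equations of the unit sphere
x = sin(u).*cos(v);
y = sin(u).*sin(v);
z = cos(u);

% Substitute into the implicit equation x^2 + y^2 + z^2 - 1 = 0
f = x.^2 + y.^2 + z.^2 - 1;

% Largest deviation over the whole grid; expected at rounding-error level
max_residual = max(abs(f(:)))
```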


Exercise 30.15 


Look up the parametric equations that define an ellipsoid, and use MATLAB to visualize it.


Exercise 30.16 


Visualize the following parametric surface 
$x = (a + r\cos(u))\cos(v),\ y = (a + r\cos(u))\sin(v),\ z = r\sin(u)$ (30.24)


with r < a and $u \in [0, 2\pi]$, $v \in [0, 2\pi]$. Describe the surface and interpret the parameters a and r.





30.3 Designing Curves and Surfaces 


Exercise 30.17 


1. Pick a fruit or vegetable. Sketch it on paper from a variety of viewpoints. Now slice it in three 
ways, and sketch the sets of curves defined by each of these sets of slices. 


2. Propose and evaluate a mathematical representation that is a good approximation to your fruit 
or vegetable. You could represent the entire surface, or you could design a set of curves that 
are good approximations to the slices. 

Solution 30.1
1. The value of y at t = 0 is A.


2. For negative values of t the value of y tends to zero if k > 0 and it tends to $+\infty$ (A > 0) or
$-\infty$ (A < 0) if k < 0. For positive values of t the value of y tends to zero if k < 0 and it tends
to $+\infty$ (A > 0) or $-\infty$ (A < 0) if k > 0.


4. The sign of the parameter A dictates whether the curve has positive or negative values of y — 
the curve also passes through the point (0, A). The parameter k dictates whether the curve 
increases or decreases exponentially. 


Solution 30.2 
1. The value of y at t = 0 is A/2.


2. For negative values of t the value of y tends to zero if k > 0 and it tends to A if k < 0. For 
positive values of t the value of y tends to A if k > 0 and it tends to zero if k < 0. 


4. The parameter A changes the long-term behavior of the curve while the parameter k changes 
how quickly the curve tends to this value. 


Solution 30.3 


2. Since a sin function returns values between -1 and 1, the parameter A controls the height of
the function and is usually referred to as the amplitude. Since a sin function is periodic with a
period of $2\pi$, the period T of this function is determined by $\omega T = 2\pi$. Increasing $\omega$ decreases
the period T, and $\omega$ is usually referred to as the angular frequency. Since a sin function is 0
when its argument is 0, the parameter $\phi$ controls where it crosses the x-axis, and is usually
referred to as the phase.


Solution 30.4 


2. Graphing the vertex form reveals the effects of the parameters as follows. The vertex of the 
parabola is located at (h, k). The parabola opens upward if g > 0 and downward if g < 0. 
The parabola is narrow and steep for large positive values of g or large negative values of g. 
Changing h and k simply changes the location of the vertex. 


3. Expanding the vertex form of the polynomial leads to $gx^2 - 2ghx + gh^2 + k$. Comparing to
the standard form we see that a = g, b = -2gh, $c = gh^2 + k$. So although a has the same
effect as g, the parameter b depends on g and h, and the parameter c depends on g, h, and k. 
This is why it is difficult to see the effect of the standard-form parameters on the curve. 


Solution 30.5 


The equation for a circle of radius R, centered at (a, b) is given by 


$(x - a)^2 + (y - b)^2 = R^2$ (30.3)

Solution 30.6


The parameters a and b determine the axes of the ellipse. The larger one is usually called the
major axis and the smaller one is usually called the minor axis. This ellipse is oriented with its
major and minor axes along the coordinate axes. Increasing a while holding b fixed results in a
vertically-squished ellipse and vice versa.


Solution 30.7 


There are two curves that define the hyperbola. Notice that the curves cross the x-axis at x = -a
and x = a respectively. The rest of each curve is unbounded, but is asymptotic to the straight lines
y = (b/a)x and y = -(b/a)x.


Solution 30.8 
The type of conic section is determined by the value of $b^2 - 4ac$ as follows:


- If $b^2 - 4ac < 0$ the equation represents an ellipse. In addition, if a = c and b = 0 the equation
represents a circle.


- If $b^2 - 4ac = 0$ the equation represents a parabola.


- If $b^2 - 4ac > 0$ the equation represents a hyperbola.


Solution 30.9 


Although there are lots of parametric equations that trace out an ellipse, the most common are closely 
related to those for a circle and take the form 


$x = a\cos u,\ y = b\sin u,\ u \in [0, 2\pi]$ (30.10)


where a and b represent the ellipse’s major and minor axes. The ellipse is traced out as u changes
from 0 to $2\pi$, but note that u does not represent the angle between the x-axis and a point on the
ellipse - see the Wikipedia page on "Ellipse" for an explanation of this. To confirm that these are
valid parametric equations for an ellipse we substitute them into the implicit equation for an ellipse


$\frac{x^2}{a^2} + \frac{y^2}{b^2} - 1 = \cos^2 u + \sin^2 u - 1 = 0$ (30.11)


where again we have used the trigonometric identity $\cos^2(u) + \sin^2(u) = 1$.


Solution 30.10 


If b = 0 we see that the curve is a circle of radius a, and the parameter u corresponds to the angle of
rotation. As you increase b, the circle changes into a spiral which tends to the origin as $u \to \infty$. The
larger the value of b the quicker the curve spirals into the origin.


Solution 30.11 


If b = 0 the curve is a circle of radius a in the xy-plane, and u is the angle of rotation. For b > 0
the curve continues to rotate as before when viewed from "above", but its height increases linearly -
the resulting curve is a helix. The separation between each rotation of the curve is given by $2\pi b$,
which is commonly known as the pitch of the helix.


Solution 30.12 


Assuming we have already packed the x-coordinates of the data into x and the y-coordinates of the 
data into y we need to define the matrix A and solve for a vector p of unknown parameters. In 
MATLAB we would use 


>> A = [x.^2 x ones(4,1)]
>> p = A'*A\A'*y

We should find that a = p(1) = 0.1910, b = p(2) = -0.2663 and c = p(3) = 0.6784. You should
graph the parabola defined by these parameters and see how good the fit is.


Solution 30.13 


Let’s take slices through the surface along each of the coordinate axes. 


1. If we choose x = c then we obtain $z = c^2/a^2 + y^2/b^2$ which is a parabola in the (y, z)-plane
that crosses the z-axis (y = 0) at $c^2/a^2$.


2. If we choose y = c then we obtain $z = x^2/a^2 + c^2/b^2$ which is a parabola in the (x, z)-plane
that crosses the z-axis (x = 0) at $c^2/b^2$.


3. If we choose z = c then we obtain $c = x^2/a^2 + y^2/b^2$ which is an ellipse in the (x, y)-plane. If
we divide both sides by c we get the standard form for an ellipse $1 = x^2/(a\sqrt{c})^2 + y^2/(b\sqrt{c})^2$
so that the major and minor axes are $a\sqrt{c}$ and $b\sqrt{c}$ - increasing the value of c increases the
axes of the ellipse.

31.1 Mathematical Representation of a Curved Surface [90 mins]


In the homework assignment we learned about describing, visualizing, and working with curves and surfaces 
in different ways. In this activity, we are going to apply what we learned and develop a mathematical 
representation of the hull of a specific boat. The boat we are going to model is called the Spray. 

The Spray was used by Joshua Slocum in 1895 when he single-handedly sailed around the world. There 
has been much debate about the seaworthiness of the Spray. On the next page are the boat lines for the 
Spray. You'll have to take some time to understand these because there is a lot going on. 

Once you think you’ve got it figured out, we'd like you to build up a representation of the hull by 
designing curves that are a good match for the waterlines and sections. You are going to do this by proposing 
particular functional forms and then finding the best-fit curve that captures the hull data. You will also have 
the option of creating a surface representation for the hull. 


Exercise 31.1 


Review the boat lines for the Spray. The waterlines are shown in the plan view in the central figure, 
and the sections are shown in the section view in the bottom figure. The buttocks are shown in the 
profile view in the top figure. Observe that data for the waterlines and the sections is presented in the 
table on the left: the waterlines are read from left to right, while the sections are read from bottom 
to top. 


1. The waterlines are the curves defined by the intersection of the hull of the boat with horizontal 
slices at different heights. Which intersections do the sections and buttocks correspond to? 


2. Propose a rectangular coordinate system (xyz) for the Spray, and discuss at least three options
for where you might locate the origin. What units are used in the plans?


3. Trace out the waterline called 18B on your lines plan and plot the data points from the table on
the left. Now propose a quadratic function, e.g. $y = g(x - h)^2 + k$, to describe the waterline
curve you have visualized. Estimate some of the function parameters that define your curve.
Visualize your curve and compare it to your data points.


4. Now trace the section curve defined at station 2 in the section view and plot the data points
from the table on the left. Now propose a power function, e.g. $y = ax^n$, to describe it.
Estimate some of the function parameters that define your curve. Visualize your curve and
compare it to your data points.


5. Now find the best-fit parameters for each of your curves using the technique outlined in the
homework. For the power function you will need to convert it to a more suitable form by
taking a logarithm.


6. Choose another waterline curve and another section curve. Can you tweak the parameters of
the same functions to find a good fit?


7. Using the data in the table on the left, can you propose a function that would represent the
surface of the hull? Go ahead and find the best-fit parameters.
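

The logarithm step mentioned above for the power function can be sketched as follows. This is an illustration under stated assumptions, not the fit for the Spray: the power function is written as $y = ax^n$ (the symbols a and n are assumed here), and the data are synthetic values generated from a = 3 and n = 0.5 rather than taken from the table. Taking logs gives $\ln y = n \ln x + \ln a$, which is a straight line in $\ln x$ and can be fitted with the same normal equations used in the homework.

```matlab
% Synthetic, hypothetical data generated from y = 3*x^0.5 (not hull data)
x = [1 2 4 8]';
y = 3*x.^0.5;

% Linearize: log(y) = n*log(x) + log(a)
A = [log(x) ones(4,1)];
p = (A'*A) \ (A'*log(y));

% Recover the power-law parameters from the straight-line fit
n = p(1)
a = exp(p(2))
```

With real measurements the recovered a and n minimize the error in ln y rather than in y itself, which is a design trade-off of this linearization.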

32.1 Overview and Orientation [3 mins]


In designing the hull of a boat, we will be thinking about how to compute quantities like area, volume, center 
of mass and center of buoyancy. These quantities involve the concept of integration. Today, we will start by 
going over basic properties of differentiation and integration involving functions of one variable. We will 
introduce the concept of multiple integrals in the next class, which will enable you to calculate quantities 
such as volume, mass, center of mass, and center of buoyancy. 

We will work exclusively in Cartesian coordinates, and use explicit or implicit function representations 
for curves and surfaces. 


32.2 Resources [2 mins] 


There is no shortage of resources available to help you with derivatives and integrals. Any single-variable or 
multi-variable calculus book will deal with these topics and you might have some useful resources from 
your high school calculus class. We'll focus on using two popular online resources: 


- Khan Academy’s videos
- Paul (not Ruvolo)’s online math notes


Both of these resources have practice problems, which we strongly recommend doing. You may wish to 
check your answers using WolframAlpha. 


32.3 Single-variable calculus [40 mins]


Big idea: Single-variable calculus hinges on one fundamental idea: There is an intimate connection between
the slope of the tangent line to a curve, and the area under the curve. The slope of the tangent line to the
curve is the derivative and the area under the curve is the integral. They are connected by the fundamental
theorem of calculus.


32.3.1 Derivative 

Additional resources for this subsection: 
- Khan Academy: Derivative as slope of curve
- Khan Academy: Formal definition of derivative as a limit
- Limit definition notes


Consider an explicit function y = f(x). The slope of the tangent line at a point on the curve is the 
derivative of the function, defined as 


$\frac{df}{dx} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$


We often use the notation $f'$ (especially if the independent variable represents a spatial coordinate) or $\dot{f}$
(often when the independent variable is time). Fortunately, we don’t have to compute the derivative of
functions using limits anymore, because humans have been doing this for over 300 years, and the derivative
of lots of functions can be expressed in terms of elementary functions.
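
To see the limit definition at work, the sketch below (an illustration with an arbitrarily chosen function and point) evaluates the difference quotient for sin(x) at x = 1 with shrinking values of h and compares it with the known derivative cos(1).

```matlab
% Function and evaluation point chosen for illustration
f = @(x) sin(x);
x0 = 1;

% Step sizes h = 1e-1, 1e-2, ..., 1e-6
h = 10.^(-(1:6))';

% Difference quotient from the limit definition
slope = (f(x0 + h) - f(x0)) ./ h;

% Error relative to the exact derivative cos(x0)
err = abs(slope - cos(x0));
disp([h slope err])
```

The error column should shrink roughly in proportion to h, which is the numerical counterpart of the limit converging.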





Figure 32.1: The derivative as a limit. 


What if we give you a derivative, and ask you to figure out the function that it came from? Now you are 
finding the anti-derivative. However, this language is not always used, and many people refer to this as the 
indefinite integral or just the integral. This is unfortunate since it presupposes the fundamental theorem of 
calculus, which probably means you didn’t even realize that this was a cool idea! Given that this terminology 
is widely used, we will just have to adopt it. 


32.3.2 Definite Integrals 


Additional resources for this subsection: 
- Khan Academy: Integrals as Riemann sums


Again consider an explicit function y = f(x). The area of the region below the curve defined by
y = f(x), above the line y = 0, and between the lines x = a and x = b, is the definite integral of f from
x = a to x = b,


$\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x.$





Figure 32.2: The integral as a limit. 
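
A Riemann sum is easy to compute directly. The sketch below is illustrative only: the function $x^2$ on [0, 1] and the use of right endpoints are choices made here. It shows the sum approaching the exact area of 1/3 as the number of strips n grows.

```matlab
% Integrand and interval chosen for illustration
f = @(x) x.^2;
a = 0; b = 1;

for n = [10 100 1000]
    dx = (b - a)/n;           % width of each strip
    xi = a + (1:n)*dx;        % right endpoint of each strip
    S = sum(f(xi))*dx;        % Riemann sum with n strips
    fprintf('n = %4d, sum = %.6f\n', n, S);
end
```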


32.3.3 The fundamental theorem of calculus and anti-derivatives 
Additional resources for this section: 

- Khan Academy: Fundamental theorem of calculus
- Khan Academy: Fundamental theorem of calculus and indefinite integrals
- Indefinite integral notes


The fundamental theorem of calculus (one of its forms anyway) states that 


$\int_a^b f(x)\,dx = F(b) - F(a)$


where F is the anti-derivative (or indefinite integral if you insist) of f, or F’ = f. In other words, integrating 
the slope of a function between two points gives the change in the function between the end-points. For 
example, 


$\int_0^{\pi/2} \cos(x)\,dx = \sin(x)\Big|_0^{\pi/2} = \sin(\pi/2) - \sin(0) = 1$
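
The same example can be checked numerically. This hedged sketch approximates the area under cos(x) with MATLAB's trapz function (a trapezoid-rule approximation, chosen here for simplicity) and compares it with the value given by the anti-derivative.

```matlab
% Sample the integrand on [0, pi/2]
x = linspace(0, pi/2, 1001);

% Numerical area under the curve using the trapezoid rule
area_numeric = trapz(x, cos(x));

% Change in the anti-derivative sin(x) between the end-points
area_exact = sin(pi/2) - sin(0);

difference = abs(area_numeric - area_exact)
```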

Exercise 32.1


1. Create a table of the five fundamental functions $x^n$, sin(x), cos(x), exp(x), and ln(x). List
both their derivatives and their anti-derivatives (integrals). Include in your table at least one
other example. Use any resource to find the derivatives and integrals. For example, you could
type in WolframAlpha


derivative of x^n
integral of x^n


2. Consider the sketch of the function below. Now try to sketch the derivative and an
anti-derivative.







32.4 Properties and Rules of Derivatives and Integrals [30 mins]

There are some key properties and rules of derivatives and integrals that we use over and over again. We
include them here for completeness and ask one or two simple questions about them. These are summarized
below:


Linearity of the Derivative and Integral (f and g are functions, c is a constant)


$(f + g)' = f' + g'$
$(cf)' = cf'$
$\int_a^b (f + g)\,dx = \int_a^b f\,dx + \int_a^b g\,dx$
$\int_a^b cf\,dx = c\int_a^b f\,dx$

Exercise 32.2


1. Use your table of fundamental functions and these properties to evaluate the derivative and
integral of $4x^3 + 3x^2 - 5x + 4$. Verify your answer using WolframAlpha.


2. Consider the sketch of the function $y = x^2 + 2x + 1$ below. Please find the shaded area. Note
that the axes are not on the same scale here.

















Chain Rule


$\frac{d}{dx} f(u(x)) = f'(u(x))\,u'(x)$


Exercise 32.3


Use your table of fundamental functions and the chain rule to determine the derivative of $(x^3 - 1)^{1000}$.
Verify your answer using WolframAlpha.





Substitution Rule


$\int_a^b f(u(x))\,u'(x)\,dx = \int_{u(a)}^{u(b)} f(u)\,du$


Exercise 32.4


Use your table of fundamental functions and the substitution rule to evaluate $\int_0^4 2\sqrt{2x + 1}\,dx$. Verify
your answer using WolframAlpha.

Product Rule


$\frac{d}{dx}\left[f(x)g(x)\right] = f\frac{dg}{dx} + g\frac{df}{dx}$


Exercise 32.5


Use your table of fundamental functions and the product rule to determine the derivative of $x^2\sin(x)$.
Verify your answer using WolframAlpha.

Solution 32.1


$f(x)$   | $\frac{df}{dx}$ | $\int f(x)\,dx$
$x^n$    | $nx^{n-1}$      | $x^{n+1}/(n+1)$
$\sin x$ | $\cos x$        | $-\cos x$
$\cos x$ | $-\sin x$       | $\sin x$
$\exp x$ | $\exp x$        | $\exp x$
$\ln x$  | $1/x$           | $x\ln x - x$



Solution 32.2 


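1. For the derivative,

$(4x^3 + 3x^2 - 5x + 4)' = (4x^3)' + (3x^2)' + (-5x)' + 4'$
$= 4(x^3)' + 3(x^2)' - 5(x)' + 4(x^0)'$
$= 4(3x^2) + 3(2x) - 5(1) + 4(0)$
$= 12x^2 + 6x - 5$

and for the integral,

$\int (4x^3 + 3x^2 - 5x + 4)\,dx = \int 4x^3\,dx + \int 3x^2\,dx + \int -5x\,dx + \int 4\,dx$
$= 4\int x^3\,dx + 3\int x^2\,dx - 5\int x\,dx + 4\int x^0\,dx$
$= 4\left(\frac{x^4}{4}\right) + 3\left(\frac{x^3}{3}\right) - 5\left(\frac{x^2}{2}\right) + 4\left(\frac{x^1}{1}\right)$
$= x^4 + x^3 - \frac{5}{2}x^2 + 4x$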








2. The y = 4 line intercepts the curve at x = -3 and x = 1. Therefore the area we are interested
in is given by the difference between the area of the shaded rectangle in the figure below, and
the integral of $x^2 + 2x + 1$ from -3 to 1.

[Figure: graph of $y = x^2 + 2x + 1$ with the shaded rectangle]

$16 - \int_{-3}^{1} (x^2 + 2x + 1)\,dx = 16 - \left[\tfrac{1}{3}x^3 + x^2 + x\right]_{-3}^{1} = 16 - \tfrac{16}{3} = \tfrac{32}{3}$





Solution 32.3 
To use the formula above for the chain rule, we let $f(x) = x^{1000}$ and $u(x) = x^3 - 1$. Then


$\frac{d}{dx}(x^3 - 1)^{1000} = \frac{d}{dx} f(u(x)) = f'(u(x))\,u'(x) = 1000(x^3 - 1)^{999}(3x^2) = 3000x^2(x^3 - 1)^{999}.$


Solution 32.4 
To use the formula above for the substitution rule, we let u(x) = 2x + 1 and $f(x) = \sqrt{x}$. Then


$\int_0^4 2\sqrt{2x+1}\,dx = \int_0^4 f(u(x))\,u'(x)\,dx = \int_1^9 \sqrt{u}\,du = \frac{u^{3/2}}{3/2}\Big|_1^9 = \frac{52}{3}.$
0 0 a7" 3 

Solution 32.5 
To use the formula above for the product rule, let $f(x) = x^2$ and g(x) = sin(x). Then


$\frac{d}{dx} x^2\sin(x) = \frac{d}{dx} f(x)g(x) = f\frac{dg}{dx} + g\frac{df}{dx} = x^2\cos(x) + \sin(x)(2x).$

33.1 Introduction [5 mins]


We have seen how integrals can be used to find the area under a graph and between graphs. Moving forward, 
we would like to be able to find the volume and center of mass of objects whose surfaces are defined by 
functions. The tool we will use is multiple integrals. We shall start by introducing the idea of double integrals 
by using them to find areas between graphs (although we can do these by one dimensional integrals with 
subtraction). We will then extend these ideas to compute the volume of 3-dimensional objects whose surfaces 
are defined by functions in the homework. 


33.2 Areas as double integrals [50 mins]


Additional resources for this section: 


- Khan Academy: Vertical area between curves (This series of videos covers the case where you are
computing $\iint dy\,dx$.)


- Khan Academy: Horizontal area between curves (This video covers the case where you are computing
$\iint dx\,dy$.)


First, we’re going to compute the area of a rectangle in the silliest way possible. 
Consider a rectangle enclosed by x = a, x = b, y = c, and y = d. The area of the rectangle according to
calculus is


$\int_a^b (d - c)\,dx$


which (fortunately) evaluates to (b - a)(d - c). Consider the integrand, d - c. According to the fundamental
theorem of calculus, this difference could be expressed as an integral


$d - c = \int_c^d dy$


which is at the very least an interesting thing to do. Substituting this expression into the earlier one means the
area of the rectangle could be expressed as a double integral


$\int_a^b \int_c^d dy\,dx$


A word on notation: the inner integral is with respect to y, with limits defined by y = c and y = d. The
outer integral is with respect to x, with limits defined by x = a and x = b. Since we could have expressed
the area as


$\int_c^d (b - a)\,dy$


the area of the rectangle can also be expressed as the double integral


$\int_c^d \int_a^b dx\,dy$


Notice how the order of integration and corresponding limits have changed, but (presumably) the result 
hasn’t. 





Figure 33.1: Simple regions in the plane that enclose areas. 


Now consider a region enclosed by two functions y = f(x) and y = g(x) and two lines x = a and x = b.
Using the same reasoning, the area of this enclosed region can be expressed as a double integral


$\int_a^b \int_{f(x)}^{g(x)} dy\,dx$


Consider the following double integral


$\int_0^1 \int_0^x dy\,dx$


The region of integration is defined between y = 0 and y = x, and from x = 0 to x = 1. If we sketch this
region we see that it is a triangle, and it should have an area of 1/2. In WolframAlpha we can issue the 
request 


integral of 1 from y = 0 to y = x and x = 0 to x = 1 


and we will happily find that it returns 1/2 as the result. 
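
The same check can be sketched in MATLAB, which this course uses later for simulations. This is only a minimal sketch, assuming MATLAB's built-in integral2 function; the variable names are ours. Note that integral2 takes the limits of the outer variable first and allows the inner limits to be function handles of the outer variable.

```matlab
% Area of the triangle 0 <= y <= x, 0 <= x <= 1 as a double integral of 1.
% In integral2(fun, xmin, xmax, ymin, ymax), x is the outer variable and
% the y limits may depend on x.
integrand = @(x, y) ones(size(x));   % must return an array the same size as its inputs
triangleArea = integral2(integrand, 0, 1, 0, @(x) x);
fprintf('Area = %.4f\n', triangleArea);   % should be close to 1/2
```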


Exercise 33.1 


Sketch the regions of integration and compute the area of the enclosed region by evaluating the 
double integral. Verify the result using WolframAlpha. 


1. $\int_0^1 \int_0^x dy\,dx$


2. $\int_1^2 \int_0^{\ln x} dy\,dx$


What if the region must be described by two functions x = f(y) and x = g(y), and two lines y = c and 
y = d? In this case we would integrate with respect to x first, and then with respect to y, 


$$\int_c^d \int_{f(y)}^{g(y)} dx\,dy$$





Exercise 33.2 


Sketch the regions of integration and compute the area of the enclosed region by evaluating the 
double integral. Verify your answer using WolframAlpha. 


1. $\int_0^1 \int_y^{2-y} dx\,dy$


2. $\int_0^1 \int_y^{\exp y} dx\,dy$


For a given region in the plane, we are faced with a choice of the order of integration. Do we integrate 
with respect to x first and then y or vice versa? Since we are computing an area, the result should be the same, 
but often times one order of integration is much easier than another—sometimes one order of integration 
can’t even be evaluated exactly! 
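
To make the choice of order concrete, here is a small numerical sketch in MATLAB. The region (between y = x^2 and y = x) is our own illustrative choice, not one of the exercise regions, and the code assumes the built-in integral2 function.

```matlab
% Region between y = x^2 and y = x for 0 <= x <= 1 (chosen for illustration).
one = @(u, v) ones(size(u));   % integrand 1, so each double integral is an area

% Order dy dx: outer variable x in [0, 1], inner variable y from x^2 to x.
areaDyDx = integral2(one, 0, 1, @(x) x.^2, @(x) x);

% Order dx dy: outer variable y in [0, 1], inner variable x from y to sqrt(y).
% integral2 always treats its first pair of limits as the outer variable,
% so we pass the y limits first.
areaDxDy = integral2(one, 0, 1, @(y) y, @(y) sqrt(y));

fprintf('dy dx: %.6f, dx dy: %.6f\n', areaDyDx, areaDxDy);   % both close to 1/6
```

The only real work in swapping the order is rewriting the limits: the curve y = x^2 becomes x = sqrt(y), and the curve y = x becomes x = y.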





Exercise 33.3 


Repeat the previous two questions, but change the order of integration. The hard part is redefining
the limits of integration.


33.3 Wrap up, and preview multiple integrals for finding volumes [20 mins]
Solution 33.1


1. The region is 














and the integral evaluates to 


$$\int_0^1 \int_0^x dy\,dx = \int_0^1 x\,dx = 1/2.$$


2. The region is 

















and the integral evaluates to 
$$\int_1^2 \int_0^{\ln x} dy\,dx = \int_1^2 \ln(x)\,dx = \Big[x\ln(x) - x\Big]_1^2 = 2\ln(2) - 1.$$


Solution 33.2 


1. The region is 


and the integral is 


$$\int_0^1 \int_y^{2-y} dx\,dy = \int_0^1 (2 - 2y)\,dy = \Big[2y - y^2\Big]_0^1 = 1.$$


2. The region is 




and the integral is 


$$\int_0^1 \int_y^{\exp y} dx\,dy = \int_0^1 (\exp y - y)\,dy = \Big[\exp y - \tfrac{1}{2}y^2\Big]_0^1 = \exp(1) - \frac{3}{2}.$$


Solution 33.3 


1. To exchange the order of integration we have to integrate in two parts 


$$\int_0^1 \int_0^x dy\,dx + \int_1^2 \int_0^{2-x} dy\,dx = \int_0^1 x\,dx + \int_1^2 (2 - x)\,dx = \frac{1}{2} + \frac{1}{2} = 1.$$


2. To exchange the order of integration we have to integrate in two parts 


$$\int_0^1 \int_0^x dy\,dx + \int_1^e \int_{\ln(x)}^1 dy\,dx = \int_0^1 x\,dx + \int_1^e (1 - \ln(x))\,dx = \frac{1}{2} + \exp(1) - 2.$$
Chapter 34


Homework 3: Derivatives, Integrals 
and Multiple Integrals 





Contents 
34.1 Areas enclosed by curves
34.2 Volumes enclosed by surfaces defined explicitly
34.3 Applications of Double Integrals
34.4 3D Physical Simulations using Matlab





In class, we introduced a few properties of differentiation and integration. Here, we introduce one more 
integration technique, followed by learning about computing areas between curves. All of these concepts 
use single-variable calculus. We then transition to problems involving multiple integration starting from 
Section 34.2. 


Integration by Parts 


Exercise 34.1 


Use your table of fundamental functions and integration by parts to determine $\int_1^2 x \exp(-x)\,dx$.
Verify your answer using WolframAlpha. 





General Practice 


Exercise 34.2 


Find the derivative and integral of the following functions, and verify your answers using
WolframAlpha. Assume the following are constant values: A, k, m, a, b, n, ω, φ, g, h.


1. $f(t) = A e^{kt}$

2. $f(x) = mx + b$

3. $f(x) = ax^n + b$

4. $f(t) = A\sin(\omega t + \phi)$

5. $f(x) = g(x - h)^2 + k$





34.1 Areas enclosed by curves 


Additional resources for this section: 
• Khan Academy videos and practice problems


Single-variable calculus gives us the tools to compute the area of regions bounded by curves. Consider 
the region bounded on top by y = f(x), on the bottom by y = g(x), and on the sides by x = a and x = b. 
Appealing to the properties of integrals, the area of this region is 


$$\int_a^b (f(x) - g(x))\,dx$$


We should note that integration will return a signed area. For example, the integral will be negative if the 
value of the function g is greater than that of f. 
Figure 34.1: The area defined by integration can be positive or negative.
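
A quick numerical illustration of the sign behavior, as a sketch only: the two curves below are hypothetical choices of ours, and the code assumes MATLAB's built-in integral function.

```matlab
f = @(x) x;        % plays the role of the "top" curve in the formula
g = @(x) x.^2;     % plays the role of the "bottom" curve in the formula

% On [0, 1] we have f >= g, so the integral of f - g is positive (1/6).
signedArea01 = integral(@(x) f(x) - g(x), 0, 1);

% On [1, 2] we have g >= f, so the same formula returns a negative value (-5/6).
signedArea12 = integral(@(x) f(x) - g(x), 1, 2);

% Integrating the absolute difference recovers the geometric area (5/6).
geometricArea12 = integral(@(x) abs(f(x) - g(x)), 1, 2);
```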


Exercise 34.3 


Consider the first four fundamental functions x^n, sin(x), cos(x), and exp(x). For each function,
sketch the region which is bounded above by the function, below by the x-axis and between x = 0
and x = 1. Use an integral to find the area of the region, and use WolframAlpha to verify your
calculations. To visualize the regions you could type the following in WolframAlpha


plot 0 < y < x^2 and 0 < x < 1


Exercise 34.4


Consider the parabola defined by y = x^2. Propose an integral that would determine the area enclosed
on the top by y = H, H > 0, on the bottom by y = x^2, on the left by x = 0, and on the right by
the intersection of the top and bottom functions. Evaluate it by hand and verify the result using
WolframAlpha.







34.2 Volumes enclosed by surfaces defined explicitly 


Additional resources for this section: 
• Khan Academy: Volume with cross sections (video series)


Consider the volume enclosed by the surfaces defined by z = f(x, y), z = 0, x = a, x = b, y = c, and
y = d. How would we compute the volume of this region? One option would be to slice the surface up in
sections parallel to one of the coordinate planes. For example, if we make a slice at x = x_1, then each planar
region is bounded by z = f(x_1, y), y = c, and y = d. The area of the enclosed region is therefore

$$\int_c^d f(x_1, y)\,dy$$

Figure 34.2: Volume defined as a double integral over a rectangle in the plane.


If we repeat this for different values of x, then we could compute the area of each cross-section as a
function of x

$$\text{area}(x) = \int_c^d f(x, y)\,dy$$

What happens if we now integrate the area function from x = a to x = b? We should get a volume, and it
should be the volume of the original enclosed region,

$$\text{Volume} = \int_a^b \text{area}(x)\,dx = \int_a^b \int_c^d f(x, y)\,dy\,dx$$

As we saw earlier, it shouldn’t matter whether we change the order of integration.

Notice that the region of integration is the rectangle in the plane defined by (x, y) ∈ [a, b] × [c, d]. There
is no reason that the integral can’t be computed over more general regions. In general we will use the
coordinate-free notation

$$\iint_D f\,dA$$

to define the integral of a function over a general region D in the plane.





Figure 34.3: Volume defined as a double integral over a general region in the plane. 
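
The slicing construction over a rectangle can be mirrored almost literally in MATLAB. The sketch below uses a hypothetical surface z = x*y^2 and hypothetical limits of our own choosing, and assumes the built-in integral and integral2 functions.

```matlab
% Hypothetical surface z = f(x, y) over the rectangle [a, b] x [c, d].
f = @(x, y) x .* y.^2;
a = 0; b = 1; c = 0; d = 2;

% Cross-sectional area at a fixed x: integrate f over y.
area = @(x) integral(@(y) f(x, y), c, d);

% Integrate the area function over x. 'ArrayValued' makes integral evaluate
% area at one scalar x at a time, which the nested integral requires.
volumeBySlices = integral(area, a, b, 'ArrayValued', true);

% The same volume written as a single double integral.
volumeDirect = integral2(f, a, b, c, d);

fprintf('slices: %.6f, direct: %.6f\n', volumeBySlices, volumeDirect);   % both close to 4/3
```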
For example, consider the following integral

$$\iint_D \frac{4y}{x^3 + 2}\,dA, \qquad D = \{(x, y) \mid 1 \le x \le 2,\ 0 \le y \le 2x\}$$

To sketch this region we draw a line at y = 0 and a line defined by y = 2x. We add lines at x = 1 and
x = 2. The region of integration is enclosed by these lines. We can evaluate this integral using
WolframAlpha by issuing the following request


integral of 4y/(x^3+2) from y = 0 to y = 2x and x = 1 to x = 2


and we find that the result is approximately 3.21059. 
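
As a cross-check, the same integral can be evaluated in MATLAB. This is a sketch assuming the built-in integral2 function; the closed form in the last line is our own working (the inner integral gives 8x^2/(x^3 + 2), whose antiderivative is (8/3) ln(x^3 + 2)).

```matlab
% Integrand 4y/(x^3 + 2), written with element-wise operators for integral2.
integrand = @(x, y) 4*y ./ (x.^3 + 2);

% Outer variable x from 1 to 2, inner variable y from 0 to 2x.
numericValue = integral2(integrand, 1, 2, 0, @(x) 2*x);

% Closed form obtained by hand: (8/3) * ln(10/3).
closedForm = (8/3) * log(10/3);

fprintf('numeric: %.5f, closed form: %.5f\n', numericValue, closedForm);
```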


Exercise 34.5 


Sketch the following region of integration in the plane, and evaluate the integral using
WolframAlpha

$$\iint_D x\cos y\,dA,$$

where D is bounded by y = 0, y = x^2, x = 0, x = 1.


34.3 Applications of Double Integrals 








So far we have been thinking exclusively in terms of geometry. In the same way that single integrals 
are used widely to compute quantities that are not areas per se, we can use double integrals to compute 
physically-relevant quantities like mass, center of mass, etc. 

As an example, consider a thin plate (thickness H cm) in 2D with variable mass density ρ gm/cm³, i.e. the
plate could be made of different material with a mass density that varies from location to location. The total
mass M of the plate is represented by a double integral of the mass density over the plate. In coordinate-free
notation, we can write

$$M = H \iint_D \rho\,dA$$

where D is the region in the plane occupied by the plate, and we evaluate the double integral depending
on how we describe the plate. Likewise, the center of mass can be expressed as a double integral, and the
relevant expressions are

$$x_{com} = \frac{H}{M} \iint_D x\rho\,dA$$

$$y_{com} = \frac{H}{M} \iint_D y\rho\,dA$$


Exercise 34.6


Find the total mass and center of mass of the 1 cm thin aluminum plate bounded by the parabola
y = x^2, y = 10, and x = 0. Assume x and y are measured in centimeters. Use WolframAlpha to
confirm your answer.
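
The mass and center-of-mass formulas above translate directly into MATLAB. The sketch below uses a different, hypothetical plate (a 2 cm by 1 cm rectangle whose density grows with x), not the aluminum plate of the exercise, and assumes the built-in integral2 function.

```matlab
% Hypothetical plate: 0 <= x <= 2, 0 <= y <= 1 (cm), thickness H (cm).
H = 0.5;
rho = @(x, y) 1 + x + 0*y;   % density in gm/cm^3; the 0*y term keeps the output array-sized

% Total mass: M = H * (double integral of rho over the plate).
M = H * integral2(rho, 0, 2, 0, 1);

% Center of mass: density-weighted average of x and of y.
xcom = (H / M) * integral2(@(x, y) x .* rho(x, y), 0, 2, 0, 1);
ycom = (H / M) * integral2(@(x, y) y .* rho(x, y), 0, 2, 0, 1);

fprintf('M = %.3f gm, COM = (%.4f, %.4f) cm\n', M, xcom, ycom);
```

Because the density increases with x, the x center of mass (7/6 cm) lands to the right of the geometric center at 1 cm, while the y center of mass stays at 0.5 cm.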
34.4 3D Physical Simulations using Matlab


In this exercise you will be installing some software that will allow you to run 3D physical simulations 
through MATLAB. These simulations will allow you to explore some of the key ideas in the boats module in 
an interactive fashion. Additionally, we will be using the same setup when we come to the robot module in 
QEA2 next semester. In this exercise all we want you to do is attempt to get the software installed and running
on your computer. If you run into any issues, please make sure to send an e-mail to paul.ruvolo@olin.edu
describing the problem you are facing (Paul is the instructor in charge of helping troubleshoot the software).
If you are unable to get the software fully set up by the end of this assignment, that is totally fine. We will
work with you to make sure you get things up and running. 

We realize there is not a lot of motivation for the steps you need to do to install the software. 
There’s actually a lot of really cool technology under the hood, but we didn’t want to burden the 
class with a bunch of unnecessary information. If you are interested in learning more, e-mail 
paul.ruvolo@olin.edu, and if enough people are interested I'll make a video explaining the setup 
in more depth. 

In order to get the software set up, you should go through the instructions in the Meeto Your Neato
document. Specifically, you should go through the following sections of that document. 


• Purpose of this how-to
• Docker Setup
• Downloading Required MATLAB Toolboxes
• Connecting to the Simulated Robot
Solution 34.1
To use the formula above for integration by parts, let f(x) = x and g(x) = -exp(-x). Then

$$\begin{aligned}
\int_1^2 x\exp(-x)\,dx &= \int_1^2 f(x)g'(x)\,dx \\
&= f(x)g(x)\Big|_1^2 - \int_1^2 g(x)f'(x)\,dx \\
&= -x\exp(-x)\Big|_1^2 + \int_1^2 \exp(-x)\,dx \\
&= -x\exp(-x)\Big|_1^2 - \exp(-x)\Big|_1^2 \\
&= -2\exp(-2) + \exp(-1) - \exp(-2) + \exp(-1) \\
&= -3\exp(-2) + 2\exp(-1)
\end{aligned}$$
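
If you prefer MATLAB to WolframAlpha for the verification step, a minimal sketch (assuming the built-in integral function) is:

```matlab
% Numerically integrate x*exp(-x) from 1 to 2 and compare with the
% closed form obtained by integration by parts.
numericValue = integral(@(x) x .* exp(-x), 1, 2);
closedForm = -3*exp(-2) + 2*exp(-1);
fprintf('numeric: %.6f, closed form: %.6f\n', numericValue, closedForm);
```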





Solution 34.2 











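f | f' | ∫ f
A exp(kt) | A k exp(kt) | (A/k) exp(kt)
m x + b | m | (m/2) x^2 + b x
a x^n + b | a n x^(n-1) | (a/(n+1)) x^(n+1) + b x
A sin(ωt + φ) | A ω cos(ωt + φ) | -(A/ω) cos(ωt + φ)
g (x - h)^2 + k | 2 g (x - h) | (g/3) (x - h)^3 + k x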


























Solution 34.3 


$$\int_0^1 x^n\,dx = \frac{x^{n+1}}{n+1}\Big|_0^1 = \frac{1}{n+1}$$

$$\int_0^1 \sin x\,dx = -\cos x\Big|_0^1 = 1 - \cos 1$$

$$\int_0^1 \cos x\,dx = \sin x\Big|_0^1 = \sin 1$$

$$\int_0^1 \exp x\,dx = \exp x\Big|_0^1 = \exp(1) - 1$$


Solution 34.4 


The top function is f(x) = H and the bottom function is g(x) = x^2. The left limit is x = 0 and the
right limit is x = √H, since this is where the top and bottom functions meet. To find the area of the
region described in the problem we need to solve the integral

$$\int_0^{\sqrt{H}} (H - x^2)\,dx.$$

This gives

$$\int_0^{\sqrt{H}} (H - x^2)\,dx = \left(Hx - \frac{1}{3}x^3\right)\Big|_0^{\sqrt{H}} = H^{3/2} - \frac{1}{3}H^{3/2} = \frac{2}{3}H^{3/2}.$$
Solution 34.5


The region of integration is 














and so the integral will be 


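$$\int_0^1 \int_0^{x^2} x\cos y\,dy\,dx$$

which we evaluate

$$\begin{aligned}
\int_0^1 \int_0^{x^2} x\cos y\,dy\,dx &= \int_0^1 x\sin y\Big|_0^{x^2}\,dx \\
&= \int_0^1 x\sin(x^2)\,dx \\
&= \frac{1}{2}\int_0^1 \sin(u)\,du \\
&= \frac{1}{2}(-\cos u)\Big|_0^1 \\
&= \frac{1}{2}(1 - \cos(1))
\end{aligned}$$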


Solution 34.6 
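With H = 1 cm and uniform density, the total mass is

$$M = H \iint_D \rho\,dA = \rho \int_0^{10} \int_0^{\sqrt{y}} dx\,dy = \rho \int_0^{10} \sqrt{y}\,dy = \frac{2}{3}\rho\,(10^{3/2}) \approx 56.9\ \text{gm}$$

where ρ is the density of aluminum in gm/cm³ (about 2.7).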
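To find the x center of mass we compute

$$\begin{aligned}
x_{com} &= \frac{3}{\rho\,2(10^{3/2})} \int_0^{10} \int_0^{\sqrt{y}} \rho x\,dx\,dy \\
&= \frac{3}{2(10^{3/2})} \int_0^{10} \frac{x^2}{2}\Big|_{x=0}^{\sqrt{y}}\,dy \\
&= \frac{3}{2(10^{3/2})} \int_0^{10} \frac{y}{2}\,dy \\
&= \frac{3}{2(10^{3/2})} \left(\frac{y^2}{4}\right)\Big|_0^{10} \\
&= \frac{3}{2(10^{3/2})} \left(\frac{10^2}{4}\right) \\
&= \frac{3}{8}\sqrt{10} \\
&\approx 1.186\ \text{cm}
\end{aligned}$$

Notice that ρ cancels out, assuming it is uniform. In other words, the center of mass for an object of
uniform density only depends on its geometry!
To find the y center of mass we compute

$$\begin{aligned}
y_{com} &= \frac{3}{\rho\,2(10^{3/2})} \int_0^{10} \int_0^{\sqrt{y}} \rho y\,dx\,dy \\
&= \frac{3}{2(10^{3/2})} \int_0^{10} xy\Big|_{x=0}^{\sqrt{y}}\,dy \\
&= \frac{3}{2(10^{3/2})} \int_0^{10} y^{3/2}\,dy \\
&= \frac{3}{2(10^{3/2})} \left(\frac{y^{5/2}}{5/2}\right)\Big|_0^{10} \\
&= 6\ \text{cm}
\end{aligned}$$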
Chapter 35


Week 4a: Center of Mass for Discrete Objects








Schedule 
35.1 COM in One Dimension [40 mins]
35.2 COM in Two Dimensions [35 mins]
35.3 Preview of the Homework [10 mins]
Introduction 


Gravity is a distributed force acting uniformly on all the mass within an object. For most purposes (i.e., when
we don’t need to consider deformation or breakage of an object) we can consider the entire gravitational 
force to be acting upon the object at a single point: this point is the center of mass (COM), also referred to as 
center of gravity (COG). 


35.1 COM in One Dimension [40 mins]


Let’s start by considering the COM for a collection of discrete, individual objects. For a set of objects, each of
mass $m_i$ located at vector position $\mathbf{r}_i$, the center of mass of the set is defined to be:

$$\mathbf{r}_{COM} = \frac{1}{\sum_i m_i} \sum_i m_i \mathbf{r}_i \qquad (35.1)$$


Equation 35.1 looks like an expression for a weighted average because the COM is the mass-weighted average 
position of an object or group of objects. 
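
The weighted-average reading of Equation 35.1 can be sketched in a few lines of MATLAB. The masses and positions below are made-up values for illustration, not data from the simulator, and the loop is written out deliberately so that each term of the sum is visible.

```matlab
% Hypothetical objects along a line: masses in kg, positions in m.
m = [3 1];
x = [0 4];

% Mass-weighted average position, one term m_i * x_i at a time.
weightedSum = 0;
for i = 1:numel(m)
    weightedSum = weightedSum + m(i) * x(i);
end
xCOM = weightedSum / sum(m);   % 1 m: three times closer to the 3 kg object than to the 1 kg one
```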


Exercise 35.1 


On your Miro board, draw a picture and write a caption that uses an example to explain this 
mathematical definition (e.g., you might consider a physical system that has mass distributed at 
different points in space). 




Now let’s start in one dimension. In this case we can define the vector position of the objects as
$\mathbf{r}_i = x_i\,\hat{\mathbf{i}}$ and the vector COM as $\mathbf{r}_{COM} = x_{COM}\,\hat{\mathbf{i}}$. Since vectors are only
equal if they have equal components we could write

$$x_{COM} = \frac{1}{\sum_i m_i} \sum_i m_i x_i$$


Exercise 35.2 


Imagine you had a ruler and several disk-shaped weights. On your Miro board, draw a picture
of your ruler with two disks positioned at random spots (you choose!) along the ruler (you can
also decide the mass of each disk). Work with your team to calculate the location of the center of
mass of the system comprised of the ruler and the two disks. Here are some things to think about
when doing this problem.


1. Considering that the weights have a significant size (i.e., they are not just points in space),
what should you take for the position of the disks in the equation for COM?


2. Try using two disks which are the same mass. Then do the same exercise, same positions,
with two disks which are very different. How does the position of the COM change with
asymmetric masses?


3. Should you take into account the mass of the ruler? If so, what should you use for the position
of the ruler?


Exercise 35.3 


In this problem you’ll be running a 3D physics simulation to help test your predictions from the
previous exercise and continue to build your intuition. If you weren’t able to get the 3D simulator
set up, hopefully at least one member of your group did. If some members of the group
don’t have the simulator running, have one member with the simulator do a screen share
for the rest of the team.

Go to the QEASimulators folder in your MATLAB Drive (you should have added this folder to
your MATLAB Drive as part of the week 3 homework). Add the Boats module folder to your path
and start the simulator by running the following commands in the MATLAB command window.


>> addpath('Boats'); 
>> qeasim start CoM 


If all went well, you should see an image of an empty simulated environment pop up. 


If your web browser starts up a new tab, but you don’t see this picture, try reloading the page. 


1. Take a look at the code in the file Boats/discreteMasses1d.m. After you’ve made
sure you understand what the code is doing (there are comments throughout), run the command
and observe what happens in your simulator window.


>> discreteMasses1d()


Does the teeter totter behave as you would expect? Note: to replay the simulation (in case
you missed it), click the Reset World button in the simulator window.


Note: you can visualize the center of mass of the teeter totter by right clicking on the
board and turning on the Center of Mass option.


2. In MATLAB, using the variables masses and positions, compute the center of mass
of the system (ignoring the pivot). Try to use matrix multiplication (or dot products) to
compute your answer (it’s okay to start with a loop, but matrix multiplication will give you
a more reliable and faster way to perform the computation). Note: if you want to create
a script or LiveScript to do this calculation, you can create one as long as it isn’t in
your QEASimulators directory (since you don’t have permission to write to that
directory).


3. Set the variable pivotOffset equal to your computed center of mass and recreate your
teeter totter by running the following command.


>> teeterTotter(masses, positions, massOfBoard, pivotOffset);


What happens? Assuming you were right about the computed center of mass, what should
happen?


4. Revisit your sketch of the disks and the ruler. See if you can confirm your computation of
the center of mass of that system using the teeterTotter function. Note that you can
change the mass of the teeter totter board (your ruler in this case) by changing the variable
massOfBoard (e.g., if you didn’t consider the ruler to be massless).
35.2 COM in Two Dimensions [35 mins]


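In two dimensions, we can express the position vectors of the objects as
$\mathbf{r}_i = x_i\,\hat{\mathbf{i}} + y_i\,\hat{\mathbf{j}}$ and the position of the COM as
$\mathbf{r}_{COM} = x_{COM}\,\hat{\mathbf{i}} + y_{COM}\,\hat{\mathbf{j}}$. Again, vectors can only be equal if their
components are equal, which means

$$x_{COM} = \frac{1}{\sum_i m_i} \sum_i m_i x_i \qquad (35.4)$$

$$y_{COM} = \frac{1}{\sum_i m_i} \sum_i m_i y_i \qquad (35.5)$$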


It’s worth noting, however, that the vector expression for the COM is much more convenient if we are 
already dealing with vectors, e.g. when computing with MATLAB. 
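
To illustrate why the vector form is convenient, here is a minimal MATLAB sketch. The numbers are hypothetical, and we assume positions is stored as a 2-by-N matrix (one column per object) and masses as an N-by-1 column vector; the variables used by the course simulator may be laid out differently.

```matlab
% Hypothetical data: three objects in the plane.
masses = [2; 1; 3];        % kg, one entry per object (column vector)
positions = [0 4 2;        % x coordinates in m
             0 0 3];       % y coordinates in m

% A single matrix-vector product forms sum(m_i * r_i) for both components
% at once; dividing by the total mass gives the COM as a 2-by-1 vector.
rCOM = (positions * masses) / sum(masses);
```

The first entry of rCOM is Equation 35.4 and the second is Equation 35.5; the same line works unchanged if a third row of z coordinates is added.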


Exercise 35.4 


Imagine the following physical system in which you have a square, 1 kg table top fused to a massless
post. On top of the table are several disks (also fused to the table top). You can assume that all objects 
are of uniform density. Here is what the system might look like. 


In Miro draw a picture of the table top (looking from the top down). Label the location of the post on 
your diagram. 


1. Where is the COM of your table? 


2. In Miro, mark out a grid on your drawing of the table (note the table is 4 meters by 4 meters). 
Consider two hypothetical disks (you can decide their masses, or you can make them the same 
to make your life easier). Using the above formulas and the table COM as the origin, work out 
positions to place these two weights such that the COM of your table stays at the origin. Try 
to do this with the disks moved off of center in both x and y directions (not just in x or y). 


3. Now move the disks to two arbitrary positions and compute the COM of the system consisting
of the table and the disks.


4. Now move your origin to the corner of the table and re-compute the position of the center of
mass for the system. Does the choice of origin affect the position of the COM relative to the
objects? How would you choose a convenient origin for a COM calculation?


5. Take a look at the code in the function discreteMasses2d, which generates the table
and weights system (the one in the diagram above). Before running the code, predict what
will happen to the table when you run the code (e.g., will it tip over? will it stay standing up?
If it will tip over, in which direction will it tip?).


6. Write MATLAB code to compute the 2D center of mass of the table and mass system from the
variables masses and positions. As in the previous exercise you should try to utilize
matrix multiplication (as opposed to loops) when possible.


7. (optional) Revisit your predictions in the previous problems and use the function
discreteMasses2d to test these predictions. You can do this by modifying the
variables masses and positions and then running the following code.


>> tableSystem(masses, positions, massOfTableTop);


35.3 Preview of the Homework [10 mins]





We'll come back together to show you what lies ahead for you on this week’s assignment. 
Solution 35.1


Lots of pictures could provide an excellent visual explanation of the mathematical definition for 
center of mass. The picture in Figure 35.1 is of two children of different masses on a seesaw. From 
intuition built on experience, we know that to balance the seesaw, the two children would have to sit 
different distances from the fixed pivot point. The mathematical definition of the COM is basically 
the inverse of this problem: for known positions and masses, where would the pivot point need to
be placed in order to achieve the condition of perfect balance? 





Figure 35.1: Two children of different mass balancing a seesaw by adjusting their distance from the 
fixed pivot point. 


Solution 35.2 


1. Considering that the disks have a significant size, what should you take for the position of the 
disks in the equation for COM? Take the position of their center of mass as their position. 


2. Try using two disks which are the same mass. Then do the same exercise, same positions, 
with two disks which are very different. How does the position of the COM change with 
asymmetric masses? The center of mass of the system will shift towards the disk with larger 
mass. 


3. Should you take into account the mass of the ruler? If so, what should you use for the position 
of the ruler? Yes, and you should use the COM of the ruler! (See a pattern?) 


Solution 35.3 


1. The center of mass of the teeter totter would be given by the following equation. 


$$x_{COM} = \frac{(10\,\text{kg})(1.5\,\text{m}) + (1\,\text{kg})(-1\,\text{m}) + (2\,\text{kg})(-1.5\,\text{m})}{10\,\text{kg} + 1\,\text{kg} + 2\,\text{kg}} \qquad (35.2)$$

$$= \frac{11}{13}\,\text{m} \qquad (35.3)$$

Since the center of mass is positive we would expect it to rotate so that the positive side crashes
to the ground (the positive side is the one with the large mass).


2. >> dot(positions, masses)/dot(masses, ones(size(masses)))


3. >> pivotOffset = 11/13; % that's CoM as computed above
>> teeterTotter(masses, positions, massOfBoard, pivotOffset);


Solution 35.4 
1. Where is the COM of your table? The table’s COM is in the middle of the table. 


2. In Miro, mark out a grid on your drawing of the table (note the table is 4 meters by 4 meters). 
Consider two hypothetical disks (you can decide their masses, or you can make them the same 
to make your life easier). Using the above formulas and the table COM as the origin, work 
out positions to place these two weights such that the COM of your table stays at the origin. 
Try to do this with the disks moved off of center in both x and y directions (not just in x or y). 
There are a variety of ways to do this. A key thing to notice is that each of the components of
the center of mass can be treated more or less independently (the position along the other axis
doesn’t affect these calculations).


3. Now move the disks to two arbitrary positions and compute the COM of the system consisting 


of the table and the disks. 


4. Now move your origin to the corner of the table and re-compute the position of the center of 
mass for the system. Does the choice of origin affect the position of the COM relative to the 
objects? How would you choose a convenient origin for a COM calculation? 


5. Take a look at the code in the function discreteMasses2d, which generates the table
and weights system (the one in the diagram above). Before running the code, predict what
will happen to the table when you run the code (e.g., will it tip over? will it stay standing up?
If it will tip over, in which direction will it tip?). The center of mass in the x direction is 0 and
0.5909 in the y-direction. This means the table will tip in the direction of positive y.


6. Write MATLAB code to compute the 2D center of mass of the table and mass system from the 
variables masses and positions. As in the previous exercise you should try to utilize 
matrix multiplication (as opposed to loops) when possible. 


>> (positions*masses)/dot(masses, ones(size(masses)))
Week 4b: COM for Continuous Objects


36.1 Conceptual COM for Continuous Objects [20 minutes]


Grab a bunch (3-5) of objects lying around in your room - make sure you choose some that "look" complicated. 
Work through the following questions with your teammates, discussing your findings as you go. 


Exercise 36.1 


1. Look at the distribution of mass on the object, including any regions made of differing materials.
Can you predict where the center of mass should be?


2. Keeping in mind that the center of mass is a point in three dimensions, how can you
experimentally locate the center of mass in all three dimensions? Look up the plumb bob method,
and explain why it works.


3. Using a couple of objects, compare predictions and measurements of center of mass. How well
do you do at guessing the center of mass for complex objects?


4. Some objects have symmetry: reflection symmetries, rotation symmetries. What do these
symmetries tell you about the center of mass position?


5. Does the center of mass of an object have to be contained within the object? Can you find an
example where the center of mass is not within the object?


6. Some objects can be considered to be made up of a system of separate objects, which can make
it easier to find the composite center of mass. Can you find (or make) an example? How would
you find the center of mass of the whole, if you can find the center of mass of each of the
parts? (Write a mathematical expression for this).
36.2 Calculation of COM for Continuous Objects [60 minutes]


We can consider a continuous object to be made up of a sum of very small discrete objects. In this exercise, 
you will use this concept to compute the center of mass for a “two dimensional” object. For the next exercise 
consider the following set of shapes 














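[Figure: four shapes, each drawn over a grid of little rectangles and titled with the curve that bounds it.
The panel titles include y = x^2/4, y = |sin(πx/2) + x|, and y = 2 exp(x/4). The horizontal axes run from -5 to 5.]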

Let’s assume that the lengths are measured in inches and that the material is hardboard with an area 
density of 1.82 grams/square inch. Each member of your team should choose one of the shapes. 


Exercise 36.2 
1. What is the area ΔA of each little rectangle? What is the mass Δm of each little rectangle?


2. The area of the object is found by summing up all the little areas, $A_{total} = \sum \Delta A$, and the
total mass is found by summing up all the little masses, $m_{total} = \sum \Delta m$. Calculate the total
area and total mass of your object.


3. In order to find the center of mass of the object, we have to multiply each mass $m_i = \Delta m$ by
the position vector of that mass square before summing. Choose an origin which you think is
convenient, and find the COM by evaluating the COM equation for a continuous object:

$$\mathbf{r}_{COM} = \frac{1}{m_{total}} \sum_i \Delta m\,\mathbf{r}_i \qquad (36.1)$$


4. If we subdivide the shapes further and further we will find that the sums turn into integrals.
If the object density is ρ then Δm = ρΔA and the COM becomes

$$\mathbf{r}_{COM} = \frac{\iint \rho\,\mathbf{r}\,dA}{\iint \rho\,dA} \qquad (36.2)$$

Each of the shapes is bounded by simple curves listed in the title of each shape. Set up and
evaluate the relevant double integrals to compute the COM, and compare with your estimation
from earlier.
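
The passage from the sum in Equation 36.1 to the integral in Equation 36.2 can be seen numerically. The sketch below uses a hypothetical triangle (0 ≤ y ≤ x, 0 ≤ x ≤ 2, in inches) rather than any of the four shapes above; the grid spacing h and all variable names are our own choices, and the integral version assumes MATLAB's built-in integral2.

```matlab
% Sum version: tile the plane with little squares of side h and keep the
% ones whose centers fall inside the triangle 0 <= y <= x, 0 <= x <= 2.
h = 0.01;
[xc, yc] = meshgrid(h/2:h:2, h/2:h:2);   % centers of the little squares
inside = yc <= xc;
rho = 1.82;                              % area density, grams per square inch
dm = rho * h^2;                          % mass of one little square
mTotal = dm * nnz(inside);
rCOMsum = dm * [sum(xc(inside)); sum(yc(inside))] / mTotal;

% Integral version: uniform rho cancels, leaving ratios of double integrals.
one = @(x, y) ones(size(x));
A = integral2(one, 0, 2, 0, @(x) x);
rCOMint = [integral2(@(x, y) x, 0, 2, 0, @(x) x);
           integral2(@(x, y) y, 0, 2, 0, @(x) x)] / A;

fprintf('sum: (%.4f, %.4f), integral: (%.4f, %.4f)\n', rCOMsum, rCOMint);
```

Shrinking h moves the sum closer to the integral result of (4/3, 2/3), which is the usual centroid of a triangle.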
Exercise 36.3


Now that you’ve had a chance to compute the center of mass of a few 2D shapes, we'll be putting 
your knowledge to the test in the COM Game™. Go ahead and start the COM game by navigating 
to your QEASimulators directory and running the following command (If you’ve already added the 
Boats directory to your path, you can skip the first command). 


>> addpath('Boats'); 
>> qeasim start comgame 


If all went well, a browser tab should pop up that looks like this. 


The goal of the COM game is to move each flat sheet so that when the physics simulation is un-paused
(see the play button in the simulator window), the sheets will rest on top of the posts. Use this link
to find a demo of how to play the game (note: if you are on Windows or Linux rather than on Mac
OSX, you can just click on the shape you want to move. There is a bug in Mac OSX that makes it hard
to do this). Also, I didn’t do all that well at this (ideally all of the sheets should stay up)!

While it’s all well and good to play the game by hand, we’d like you to bring in the tools of multiple
integrals that you learned earlier. Compute the COM for each sheet using multiple integrals (you
can either do them by hand or use a computational tool like WolframAlpha). Each sheet is defined
by the following region with -1 ≤ x ≤ 1:


Sheet Color    Bottom Curve    Top Curve
Orange         y = |x|         y = 1
Yellow         y = x^2         y = 1
Green          y = |x|^3       y = 1
Blue           y = x^4         y = 1
Purple         y = |x|^5       y = 1











Notice the absolute value on the odd powers - you may wish to split the integral into two pieces for 
these cases (and if you are very clever you may be able to use symmetry to simplify your problem 
further!). 


Once you’ve computed a center of mass, you can position the sheet at your computed center of mass
using the following MATLAB code (xcom = 0, ycom = 0 corresponds to the bottom center
of each sheet). Note we put the sheet at the negative of its center of mass so that the center
of mass shifts to the origin (which is located directly over the post). If you’d rather do it by
hand, you can certainly use the graphical interface of the simulator to position the sheet so that
its center of mass is above the post. For example, if we computed the x center of mass to be 0 and the
y center of mass to be 0.2 m, we would use the following code to position the first (orange) sheet.


>> sheetId = 1; % 1 is the Orange Sheet, 2 is the Yellow, etc.
>> xcom = 0;
>> ycom = 0.2;
>> comGamePosition(sheetId, -xcom, -ycom);


Once you've positioned all of the sheets, unpause the simulation and see what happens! 




36.3 Wrap-up (10 mins) 





We'll cover any common confusions we saw during the time in the breakout rooms. We'll also tell you what 
to expect for the final event. 
Solution 36.1


1. Look at the distribution of mass on the object, including any regions made of differing materials. 
Can you predict where the center of mass should be? 


2. Keeping in mind that the center of mass is a point in three dimensions, how can you
experimentally locate the center of mass in all three dimensions? Look up the plumb bob method,
and explain why it works. Gravity acts at the COM, and the object will rotate until the COM is 
directly below the hanging point. Therefore, each time you hang the object from a point, the 
vertical line through the object contains the COM. If you hang the object from two or more 
points and draw the vertical lines, the intersection of the lines will be the COM. 


3. Using a couple of objects, compare predictions and measurements of center of mass. How well 
do you do at guessing the center of mass for complex objects? 


4. Some objects have symmetry: reflection symmetries, rotation symmetries. What do these 
symmetries tell you about the center of mass position? The COM lies along the line of reflection 
for a 2-D object or on the plane of reflection for a 3-D object. For an object with rotational 
symmetry, the COM lies at the point of rotation for a 2-D object or along the line of rotation 
for a 3-D object. 


5. Does the center of mass of an object have to be contained within the object? Can you find an 
example where the center of mass is not within the object? Nope! A coffee cup is one common 
object where the COM is not within the material of the object. 


6. Some objects can be considered to be made up of a system of separate objects, which can make 
it easier to find the composite center of mass. Can you find (or make) an example? How would 
you find the center of mass of the whole, if you can find the center of mass of each of the 
parts? (Write a mathematical expression for this). 
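
One way to write the expression requested in item 6, as a sketch: suppose part k has mass m_k and its own center of mass at position r_k (these symbols are introduced here for illustration and do not appear elsewhere in the text). The center of mass of the whole is then the mass-weighted average of the centers of mass of the parts.

```latex
\mathbf{r}_{COM} = \frac{\sum_{k} m_k \, \mathbf{r}_k}{\sum_{k} m_k}
```

For example, two parts of equal mass have their composite COM midway between the two part centers.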


Solution 36.2 


Solution 36.3 
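For the pth sheet (where p can be 1, 2, 3, 4, or 5) we have: 


$$ m = \int_{-1}^{1} \int_{|x|^p}^{1} dy\,dx \tag{36.3} $$

$$ x_{COM} = \frac{1}{m} \int_{-1}^{1} \int_{|x|^p}^{1} x\,dy\,dx \tag{36.4} $$

$$ y_{COM} = \frac{1}{m} \int_{-1}^{1} \int_{|x|^p}^{1} y\,dy\,dx \tag{36.5} $$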


If we think about x_COM, we can see that it will be 0 since each shape is symmetric about the y-axis 
(you could also evaluate the integrals to show this). As a second consequence of this symmetry, 
we can divide each sheet into the portion on the left and the portion on the right of the y-axis and 
conclude that the y-component of the center of mass for each of these halves will be the same. Since 
the y-components of these two centers of mass are the same (that of the left and the right half), we 
can work with the right half (x >= 0), which allows us to drop the absolute value around x and 
results in integrals that are easier to evaluate. 
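
As a sketch of how the right-half integrals can be evaluated (assuming uniform density, so that the mass equals the area as in (36.3); the subscript R for "right half" and the symbol M_y for the first moment are labels introduced here for illustration):

```latex
m_R = \int_{0}^{1} \int_{x^p}^{1} dy\,dx
    = \int_{0}^{1} \left(1 - x^p\right) dx
    = \frac{p}{p+1}

M_y = \int_{0}^{1} \int_{x^p}^{1} y\,dy\,dx
    = \int_{0}^{1} \frac{1 - x^{2p}}{2}\,dx
    = \frac{p}{2p+1}

y_{COM} = \frac{M_y}{m_R} = \frac{p+1}{2p+1}
```

For p = 1 this gives 2/3, which matches the centroid of a triangle whose vertex sits at the origin and whose base lies along y = 1.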
(36.6) 


(36.7) 


(36.8) 
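
If you would rather check the integrals numerically, the following is a minimal sketch using MATLAB's integral2 on the right half of each sheet. It assumes uniform density; the variable names bottom, m, My, and ycom are chosen here for illustration.

```matlab
% Numerically evaluate the right-half integrals for each sheet p = 1..5.
ycom = zeros(1, 5);
for p = 1:5
    bottom = @(x) x.^p;   % lower limit y = x^p on 0 <= x <= 1
    % Area of the right half (uniform density assumed, so mass = area).
    m = integral2(@(x, y) ones(size(x)), 0, 1, bottom, 1);
    % First moment of the right half about the x-axis.
    My = integral2(@(x, y) y, 0, 1, bottom, 1);
    ycom(p) = My / m;
end
disp(ycom)
```

Because x_COM is 0 by symmetry, each entry of ycom is the only value you need to position the corresponding sheet.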
Chapter 37 


Homework 4: Center of Mass and 
Center of Buoyancy of Boats 
In this homework assignment you will be exploring the concept of center of mass and center of buoyancy 
and how they relate to the motion of a solid submerged in water. These exercises will provide you with a 
conceptual and computational foundation for our explorations of boat stability next semester. 


37.1 Exploration of Buoyancy 


For this part of the assignment you will be working with a 3D physics simulation of a solid (let’s call it a 
boat) submerged in water. To start the simulator, open MATLAB, go to your QEASimulators folder (you 
should have this in your MATLAB drive as a part of the work you did for the last assignment), and run the 
following two commands in MATLAB’s command window. 


>> addpath('Boats'); 
>> qeasim start fourteen_boats 


If all went well, a web browser will pop up with this visualization of, you guessed it, 14 boats!! 





Figure 37.1: A visualization of a simulation of fourteen boats. 
It will be useful to refer to each of these boats with a number (both when we run MATLAB code later, 
but also to refer to them in your answers). Here is a visualization with each of the boats labeled. 





Figure 37.2: Each of the fourteen boats labeled with its number. 


Boats 1-4, 13, and 14 are low density (their density is 1/4 that of water). Boats 5-8 are medium density 
(their density is 1/2 that of water). Boats 9-12 are high density (their density is 3/4 that of water). 


Exercise 37.1 


Perform some experiments in the simulator and record your observations. We'll walk you through the 
most important actions you will need to perform in the simulator, but if you want a more complete 
rundown you can refer to the Gzweb user guide to learn how to use the interface. 


1. Unpause the simulation by clicking the play button. Watch the boats come to equilibrium 
(meaning they stop moving). Examine the waterlines of each boat (measured from the bottom 
of the boat to the water). How do these waterlines compare across differently shaped boats? 
How do these waterlines compare across boats of different density? 


2. Observe the frequency of oscillations of each boat. To observe the boats dropping into the 
water again, you can reset the simulator state by clicking the “Reset Model Poses” button under 
the “Edit” menu. How do the frequencies of oscillation compare across differently shaped 
boats? How do the frequencies of oscillation compare across boats of different densities? 


3. Next we'll be manipulating the state of the boats using MATLAB. We have provided a function 
called placeBoat. The first input to placeBoat is the number of the boat you want to 
manipulate (see the figure above for the mapping of numbers to boats), the second argument 
is the heel angle of the boat in degrees, and the third argument is the height of the bottom-center 
of the boat relative to the water. For instance, placeBoat(1, 0, 1) would drop boat 1 
from a height of 1 m (make sure to unpause the simulation to see what the boat does when placed 
in this configuration). Here is a diagram that shows each argument to placeBoat. 

For now we want you to focus on exploring the boat’s linear motion (i.e., with the boat held flat). 


Here are some experiments you might want to try. 


• Drop a boat from some height above the water and observe what happens (e.g., run 
placeBoat(1, 0, 2) to drop the first boat from a height of 2 meters above the water). 
Try this for a few different boats and record some observations. 


• Place a boat below the water and observe what happens (e.g., run 
placeBoat(1, 0, -0.5) to place the first boat 0.5 meters below the water). 
Try this for a few different boats and record some observations. 


• Determine the waterline of a boat by placing it at different depths. How do you know 
when you’ve found the waterline? Find the waterlines of a few of the boats and record 
your measurements. 


4. We have also provided a function called measureBoat that will tell you a particular boat’s 
current position and heel angle. The input to measureBoat is the number of the boat you’d 
like to measure. The function returns two outputs. The first output is a 3D column vector that 
holds the x, y, and z location of the boat in meters. The second output is the heel angle in 
degrees of the boat. For example, if you run [pos, heel] = measureBoat(1) then 
you should get back the current position and heel angle of boat 1. 


Using measureBoat determine the waterlines of a few of the boats. 


Exercise 37.2 


Next, we'll examine the rotational motion of the boats. 


1. Using the function placeBoat, rotate some of the boats by a small angle (try 10 degrees or 
less), setting the boat’s depth using the waterline you found for the boat when it was floating flat. 
What happens to each boat when you rotate it a small amount (e.g., does it rotate 
after being given a small heel angle? Does it bob up and down?) Write down any interesting 
patterns you see across the different boats. What happens when you rotate boats 13 and 14? 


2. Using the function placeBoat, rotate some of the boats by a medium angle (maybe 45 
degrees or so). Try to set the boat’s depth so they don’t bob too much (you may find that the 
waterline has changed significantly from when the boat was floating flat). Write down any 
interesting patterns you see across the different boats. 


3. Using the function placeBoat, find the angle of vanishing stability (AVS) of a few of the 
boats (you can decide exactly how many to do and how you want to record the results). 
Describe the procedure you use to find the AVS of each boat. Remember that the AVS is the 
heel angle at which the boat stops righting itself and instead flips over. When determining the 
AVS, you'll want to make sure the boat is floating at its natural waterline for that particular 
angle (i.e., initially the boat shouldn’t bob). Note any interesting patterns you see in the AVS 
values of different boats (e.g., how do the AVS values change with density?). 







37.2 Computational Approaches to Calculating Center of Mass and 
Center of Buoyancy 


Now that you’ve performed some experiments to explore the behavior of the boats, you’re going to develop 
a computational approach to compute two quantities that will be of primary importance in predicting the 
motion of the boat under various conditions: the center of mass and the center of buoyancy. 

The table below describes the geometry and density of each boat. Note that we use a coordinate system 
where x goes across the boat, z is up and down, and y goes along the boat (i.e., in the direction perpendicular 
to the boat’s cross section). We use meters for units of length. 


[Figure: boat cross section with the z (m) axis, labeled with Deck Width (W), Boat Length (L), Deck Height (D), and Bottom of Hull.] 


Figure 37.3: Our coordinate system and other conventions for describing the shapes of the boats. 
Boat ID | Height (D) | Width (W) | Bottom of Hull | Length (L) | Density 
1  | 0.5 m | 1 m | peel    | 0.6 m | 250 kg/m^3 
2  | 0.5 m | 1 m | D (zy   | 0.6 m | 250 kg/m^3 
3  | 0.5 m | 1 m | D (     | 0.6 m | 250 kg/m^3 
4  | 0.5 m | 1 m | D (2 .  | 0.6 m | 250 kg/m^3 
5  | 0.5 m | 1 m | piel    | 0.6 m | 500 kg/m^3 
6  | 0.5 m | 1 m | D (Cay  | 0.6 m | 500 kg/m^3 
7  | 0.5 m | 1 m | D Cae   | 0.6 m | 500 kg/m^3 
8  | 0.5 m | 1 m | D (yr   | 0.6 m | 500 kg/m^3 
9  | 0.5 m | 1 m | pil     | 0.6 m | 750 kg/m^3 
10 | 0.5 m | 1 m | D (22)  | 0.6 m | 750 kg/m^3 
11 | 0.5 m | 1 m | D ( =)" | 0.6 m | 750 kg/m^3 
12 | 0.5 m | 1 m | D Ca    | 0.6 m | 750 kg/m^3 
13 | 1.5 m | 1 m | piel    | 0.6 m | 250 kg/m^3 
14 | 1.5 m | 1 m | D (=    | 0.6 m | 250 kg/m^3 

















Exercise 37.3 


First let’s compute the center of mass of one of our boats. In Week 4b we saw two different approaches 
for solving this problem. The first approach we saw was to chop up the boat into a bunch of little 
rectangles and treat each of those rectangles as a discrete mass. The second approach we saw was 
to use a double integral to compute the center of mass. In this exercise we're going to use the first 
approach (chopping up the boat into rectangles) for reasons that will become clear as you move through 
the problem. Since each of these boats is made from a simple 2D shape that has been extruded along 
its y-axis, we'll analyze the 2D cross section of the boat and use that to compute the center of mass. 
Next semester you'll see how to extend this analysis to a 3D boat hull (or perhaps by the end of this 
you may already be able to see yourself how you might do this). 


1. Choose one of the boats. In MATLAB, using the function meshgrid, define a 2D grid that 
encompasses the entire cross section of the boat’s hull (it’s okay if the meshgrid extends 
beyond the boat hull, but make sure it fully covers the hull). Take the resulting matrices 
for your x and z points and reshape them into a matrix that is 2 x N, where N is the total 
number of points in your mesh grid (i.e., each point in your meshgrid should be represented as 
a 2 x 1 column vector in this matrix). 


Hint: you may find it useful to use the syntax X(:) to convert the matrix X into a column 
vector by unrolling it columnwise (e.g., if X = [1, 2; 3, 4] then X(:) will give the 
column vector [1; 3; 2; 4]). 


2. Create an N x 1 column vector that contains the mass of each of the N areas that comprise 
your mesh grid. Since some points in your mesh grid might lie outside the boat’s hull, you'll 
want to make sure that you give these areas a mass of 0. 


Hint: You can use Boolean expressions in MATLAB to test whether or not a point is in your 
hull. For example, if you have your meshgrid points in the 2 x N matrix P, the bottom of your 
boat’s hull defined by the curve z = x^2, and the top defined by the curve z = 1, you could use 
the following code to create a Boolean vector with a 1 when the corresponding column of P 
represents a point in the hull and 0 when the point is outside the hull. 


insideBoat = P(2,:) >= P(1,:).^2 & P(2,:) <= 1; 
Hint 2: To make sure you did things correctly, you can visualize your vector 
insideBoat using the scatter function. 


scatter(P(1,insideBoat), P(2,insideBoat)); 


3. Compute the 2D center of mass of the boat using your 2 x N matrix of meshgrid locations 
along with your N x 1 column vector of masses. If you want to use loops at first, go ahead, 
but perhaps try to use the matrix multiplication tools we learned about in module 1 to speed 
up and simplify your code (we did this same problem in Week 4b). 
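
The three steps above can be sketched end to end as follows. This is a minimal illustration rather than the course solution: it uses the example hull from the hint (bottom z = x^2, top z = 1) instead of one of the fourteen boats, and the grid spacing, density, and variable names (dx, dz, rho, L, masses, com) are assumptions made for this example.

```matlab
% All numeric values here are illustrative assumptions, not taken from the boat table.
dx = 0.01;  dz = 0.01;                 % grid spacing in x and z (m)
[X, Z] = meshgrid(-1:dx:1, 0:dz:1);    % grid covering the example cross section
P = [X(:)'; Z(:)'];                    % 2 x N matrix, one grid point per column

% Example hull from the hint: bottom z = x^2, top z = 1.
insideBoat = P(2,:) >= P(1,:).^2 & P(2,:) <= 1;

rho = 500;                             % assumed density (kg/m^3)
L = 0.6;                               % assumed boat length (m)

% Each point stands for a dx-by-dz rectangle extruded along y by L.
% Points outside the hull get zero mass because insideBoat is 0 there.
masses = rho * dx * dz * L * double(insideBoat');   % N x 1 column vector

% (2 x N) * (N x 1) sums the mass-weighted coordinates in one product.
com = (P * masses) / sum(masses);      % 2 x 1 vector [xcom; zcom]
disp(com)
```

For this example hull the analytic centroid sits at x = 0, z = 3/5, so com should come out close to that, with a small discretization error that shrinks as dx and dz decrease.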


Exercise 37.4 


Using one of the boat shapes (you can use the same one you did for the previous problem or choose 
a new one), compute the center of buoyancy of the boat assuming the water is at the level z = d 
(if you are working towards improving your MATLAB programming chops, this would be a great 
opportunity to create a function that takes d as an input and returns the center of buoyancy of the 
boat as its output). The center of buoyancy of the boat is defined by the following formulas, where a_i 
is the area of the ith region. 


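$$ x_{COB} = \frac{\sum_i a_i x_i}{\sum_i a_i} \tag{37.1} $$


$$ z_{COB} = \frac{\sum_i a_i z_i}{\sum_i a_i} \tag{37.2} $$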


Notice that these are the same formulas we saw for the COM except we sum over each area times its 
x-coordinate (e.g., for x_COB) instead of mass times the x-coordinate (e.g., as we did for x_COM). To 
see the connection to the previous problem clearly, you may want to compute an N x 1 column 
vector that includes the area of each point in the mesh grid that is below the water and inside the 
boat hull (assign an area of 0 for any points that don’t meet these two criteria). 
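
A minimal sketch of this calculation, again using the example hull z = x^2 capped at z = 1 rather than one of the fourteen boats. The water level d = 0.5 and the names underWater, areas, and cob are assumptions made for this illustration.

```matlab
% Illustrative grid and hull (assumptions): bottom z = x^2, top z = 1.
dx = 0.01;  dz = 0.01;
[X, Z] = meshgrid(-1:dx:1, 0:dz:1);
P = [X(:)'; Z(:)'];
insideBoat = P(2,:) >= P(1,:).^2 & P(2,:) <= 1;

d = 0.5;                               % assumed water level z = d (m)
underWater = P(2,:) <= d;              % 1 where the grid point is below the water

% Area is dx*dz for submerged hull points and 0 everywhere else.
areas = dx * dz * double((insideBoat & underWater)');   % N x 1 column vector

% Same matrix product as for the COM, with areas in place of masses.
cob = (P * areas) / sum(areas);        % 2 x 1 vector [xcob; zcob]
disp(cob)
```

To turn this into a function of d, the lines from underWater onward could be moved into a function body that takes P, insideBoat, dx, dz, and d as inputs.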


Exercise 37.5 


Plot the center of mass and center of buoyancy of your boat after rotating the boat by some angle 
about the origin (recall that this is the heel angle of the boat). You can create a function that takes 
both the depth of the boat and the heel angle and creates the plot or if you are not yet comfortable 
with that, you can hardcode particular values. You should also visualize the boat hull in some manner 
(Using scatter as we suggested in a previous hint is a good approach. We show how to do this 
in the solutions notebook). In order to perform this analysis, all you need to do is rotate your boat 
about the origin by the appropriate angle (time to dust off your 2D rotation matrix from earlier in 
the semester!). 
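
One possible shape for this analysis is sketched below, under several assumptions: the example hull z = x^2 capped at z = 1, a heel angle of 20 degrees treated as a counterclockwise rotation, and a water level of d = 0.5. The sign convention for the heel angle and all variable names are choices made for this illustration, not taken from the solutions notebook.

```matlab
% Illustrative grid and hull (assumptions): bottom z = x^2, top z = 1.
dx = 0.01;  dz = 0.01;
[X, Z] = meshgrid(-1:dx:1, 0:dz:1);
P = [X(:)'; Z(:)'];
insideBoat = P(2,:) >= P(1,:).^2 & P(2,:) <= 1;

theta = 20;                            % assumed heel angle (degrees)
R = [cosd(theta), -sind(theta);
     sind(theta),  cosd(theta)];       % counterclockwise 2D rotation matrix
Prot = R * P;                          % rotate every grid point about the origin

d = 0.5;                               % assumed water level z = d (m)
% Uniform density and equal cell areas, so constant factors cancel in the ratios.
masses = double(insideBoat');
areas = double((insideBoat & Prot(2,:) <= d)');   % submerged test uses rotated z

com = (Prot * masses) / sum(masses);   % rotated center of mass
cob = (Prot * areas) / sum(areas);     % center of buoyancy of the heeled hull

scatter(Prot(1,insideBoat), Prot(2,insideBoat), '.');
hold on;
plot(com(1), com(2), 'r*', cob(1), cob(2), 'b*');
yline(d);                              % waterline
axis equal;
hold off;
legend('hull', 'COM', 'COB', 'waterline');
```

Note that the hull membership test is done on the unrotated points, while the underwater test is done on the rotated points, because the water stays level while the boat heels.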
Solution 37.1 


Since this is intended to be exploratory, we’re not going to give a solution. We’d love to have some 
interesting discussion on the Teams General channel about what you are seeing in this exercise 
(we’re betting that this is the week). 


Solution 37.2 


Since this is intended to be exploratory, we’re not going to give a solution. We’d love to have some 
interesting discussion on the Teams General channel about what you are seeing in this exercise 
(we’re betting that this is the week). 


Solution 37.3 
In the Boats Homework 4 MATLAB drive folder, there is a LiveScript notebook called 
solutions.mlx with the solutions to this exercise. 


Solution 37.4 
In the Boats Homework 4 MATLAB drive folder, there is a LiveScript notebook called 
solutions.mlx with the solutions to this exercise. 


Solution 37.5 
In the Boats Homework 4 MATLAB drive folder, there is a LiveScript notebook called 
solutions.mlx with the solutions to this exercise. 




